mirror of
https://git.adityakumar.xyz/llama.cpp.git
synced 2024-11-09 15:29:43 +00:00
Update README.md
parent
8a01f565ff
commit
b85028522d
1 changed file with 52 additions and 34 deletions
86
README.md
|
@@ -17,11 +17,11 @@ The main goal is to run the model using 4-bit quantization on a MacBook.
|
||||||
|
|
||||||
This was hacked in an evening - I have no idea if it works correctly.
|
This was hacked in an evening - I have no idea if it works correctly.
|
||||||
|
|
||||||
So far, I've tested just the 7B model and the generated text starts coherently, but typically degrades significanlty after ~30-40 tokens.
|
So far, I've tested just the 7B model.
|
||||||
Here is a "typical" run:
|
Here is a "typical" run:
|
||||||
|
|
||||||
```java
|
```java
|
||||||
make -j && ./main -m ./models/7B/ggml-model-q4_0.bin -t 8 -n 128
|
make -j && ./main -m ../LLaMA-4bit/7B/ggml-model-q4_0.bin -p "Building a website can be done in 10 simple steps:" -t 8 -n 512
|
||||||
I llama.cpp build info:
|
I llama.cpp build info:
|
||||||
I UNAME_S: Darwin
|
I UNAME_S: Darwin
|
||||||
I UNAME_P: arm
|
I UNAME_P: arm
|
||||||
|
@@ -32,33 +32,16 @@ I LDFLAGS: -framework Accelerate
|
||||||
I CC: Apple clang version 14.0.0 (clang-1400.0.29.202)
|
I CC: Apple clang version 14.0.0 (clang-1400.0.29.202)
|
||||||
I CXX: Apple clang version 14.0.0 (clang-1400.0.29.202)
|
I CXX: Apple clang version 14.0.0 (clang-1400.0.29.202)
|
||||||
|
|
||||||
c++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread main.cpp ggml.o utils.o -o main -framework Accelerate
|
make: Nothing to be done for `default'.
|
||||||
./main -h
|
main: seed = 1678486056
|
||||||
usage: ./main [options]
|
llama_model_load: loading model from '../LLaMA-4bit/7B/ggml-model-q4_0.bin' - please wait ...
|
||||||
|
|
||||||
options:
|
|
||||||
-h, --help show this help message and exit
|
|
||||||
-s SEED, --seed SEED RNG seed (default: -1)
|
|
||||||
-t N, --threads N number of threads to use during computation (default: 4)
|
|
||||||
-p PROMPT, --prompt PROMPT
|
|
||||||
prompt to start generation with (default: random)
|
|
||||||
-n N, --n_predict N number of tokens to predict (default: 128)
|
|
||||||
--top_k N top-k sampling (default: 40)
|
|
||||||
--top_p N top-p sampling (default: 0.9)
|
|
||||||
--temp N temperature (default: 0.8)
|
|
||||||
-b N, --batch_size N batch size for prompt processing (default: 8)
|
|
||||||
-m FNAME, --model FNAME
|
|
||||||
model path (default: models/llama-7B/ggml-model.bin)
|
|
||||||
|
|
||||||
main: seed = 1678476633
|
|
||||||
llama_model_load: loading model from './models/7B/ggml-model-q4_0.bin' - please wait ...
|
|
||||||
llama_model_load: n_vocab = 32000
|
llama_model_load: n_vocab = 32000
|
||||||
llama_model_load: n_ctx = 512
|
llama_model_load: n_ctx = 512
|
||||||
llama_model_load: n_embd = 4096
|
llama_model_load: n_embd = 4096
|
||||||
llama_model_load: n_mult = 256
|
llama_model_load: n_mult = 256
|
||||||
llama_model_load: n_head = 32
|
llama_model_load: n_head = 32
|
||||||
llama_model_load: n_layer = 32
|
llama_model_load: n_layer = 32
|
||||||
llama_model_load: n_rot = 64
|
llama_model_load: n_rot = 128
|
||||||
llama_model_load: f16 = 2
|
llama_model_load: f16 = 2
|
||||||
llama_model_load: n_ff = 11008
|
llama_model_load: n_ff = 11008
|
||||||
llama_model_load: ggml ctx size = 4529.34 MB
|
llama_model_load: ggml ctx size = 4529.34 MB
|
||||||
|
@@ -66,24 +49,59 @@ llama_model_load: memory_size = 512.00 MB, n_mem = 16384
|
||||||
llama_model_load: .................................... done
|
llama_model_load: .................................... done
|
||||||
llama_model_load: model size = 4017.27 MB / num tensors = 291
|
llama_model_load: model size = 4017.27 MB / num tensors = 291
|
||||||
|
|
||||||
main: prompt: 'If'
|
main: prompt: 'Building a website can be done in 10 simple steps:'
|
||||||
main: number of tokens in prompt = 2
|
main: number of tokens in prompt = 15
|
||||||
1 -> ''
|
1 -> ''
|
||||||
3644 -> 'If'
|
8893 -> 'Build'
|
||||||
|
292 -> 'ing'
|
||||||
|
263 -> ' a'
|
||||||
|
4700 -> ' website'
|
||||||
|
508 -> ' can'
|
||||||
|
367 -> ' be'
|
||||||
|
2309 -> ' done'
|
||||||
|
297 -> ' in'
|
||||||
|
29871 -> ' '
|
||||||
|
29896 -> '1'
|
||||||
|
29900 -> '0'
|
||||||
|
2560 -> ' simple'
|
||||||
|
6576 -> ' steps'
|
||||||
|
29901 -> ':'
|
||||||
|
|
||||||
sampling parameters: temp = 0.800000, top_k = 40, top_p = 0.950000
|
sampling parameters: temp = 0.800000, top_k = 40, top_p = 0.950000
|
||||||
|
|
||||||
|
|
||||||
If you are a fan of the original Star Wars trilogy, then you'll want to see this.
|
Building a website can be done in 10 simple steps:
|
||||||
If you don't know your Star Wars lore, this will be a huge eye-opening and you will be a little confusing.
|
1) Select a domain name and web hosting plan
|
||||||
Awesome movie. [end of text]
|
2) Complete a sitemap
|
||||||
|
3) List your products
|
||||||
|
4) Write product descriptions
|
||||||
|
5) Create a user account
|
||||||
|
6) Build the template
|
||||||
|
7) Start building the website
|
||||||
|
8) Advertise the website
|
||||||
|
9) Provide email support
|
||||||
|
10) Submit the website to search engines
|
||||||
|
A website is a collection of web pages that are formatted with HTML. HTML is the code that defines what the website looks like and how it behaves.
|
||||||
|
The HTML code is formatted into a template or a format. Once this is done, it is displayed on the user's browser.
|
||||||
|
The web pages are stored in a web server. The web server is also called a host. When the website is accessed, it is retrieved from the server and displayed on the user's computer.
|
||||||
|
A website is known as a website when it is hosted. This means that it is displayed on a host. The host is usually a web server.
|
||||||
|
A website can be displayed on different browsers. The browsers are basically the software that renders the website on the user's screen.
|
||||||
|
A website can also be viewed on different devices such as desktops, tablets and smartphones.
|
||||||
|
Hence, to have a website displayed on a browser, the website must be hosted.
|
||||||
|
A domain name is an address of a website. It is the name of the website.
|
||||||
|
The website is known as a website when it is hosted. This means that it is displayed on a host. The host is usually a web server.
|
||||||
|
A website can be displayed on different browsers. The browsers are basically the software that renders the website on the user’s screen.
|
||||||
|
A website can also be viewed on different devices such as desktops, tablets and smartphones. Hence, to have a website displayed on a browser, the website must be hosted.
|
||||||
|
A domain name is an address of a website. It is the name of the website.
|
||||||
|
A website is an address of a website. It is a collection of web pages that are formatted with HTML. HTML is the code that defines what the website looks like and how it behaves.
|
||||||
|
The HTML code is formatted into a template or a format. Once this is done, it is displayed on the user’s browser.
|
||||||
|
A website is known as a website when it is hosted
|
||||||
|
|
||||||
main: mem per token = 14434244 bytes
|
main: mem per token = 14434244 bytes
|
||||||
main: load time = 1313.77 ms
|
main: load time = 1332.48 ms
|
||||||
main: sample time = 6.17 ms
|
main: sample time = 1081.40 ms
|
||||||
main: predict time = 3271.53 ms / 54.53 ms per token
|
main: predict time = 31378.77 ms / 61.41 ms per token
|
||||||
main: total time = 4797.98 ms
|
main: total time = 34036.74 ms
|
||||||
```
|
```
|
||||||
|
|
||||||
## Usage
|
## Usage
|
||||||
|
|