mirror of
https://git.adityakumar.xyz/llama.cpp.git
synced 2024-11-14 00:59:43 +00:00
884e7d7a2b
* ggml : use 8-bit precision for Q4_1 intermediate results (ARM)
* ggml : optimize ggml_vec_dot_q4_1_q8_0() via vmalq_n_f32

  56 ms/token with Q4_1 !
* ggml : AVX2 implementation of ggml_vec_dot_q4_1_q8_0 (#1051)
* gitignore : ignore ppl-*.txt files

---------

Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>
42 lines
391 B
Text
*.o
*.a
.DS_Store
.build/
.cache/
.direnv/
.envrc
.swiftpm
.venv
.vs/
.vscode/

build/
build-em/
build-debug/
build-release/
build-static/
build-no-accel/
build-sanitize-addr/
build-sanitize-thread/

models/*

/main
/quantize
/quantize-stats
/result
/perplexity
/embedding
/benchmark-q4_0-matmult
/vdot
/Pipfile

arm_neon.h
compile_commands.json

__pycache__

zig-out/
zig-cache/

ppl-*.txt