Commit graph

209 commits

Author SHA1 Message Date
Georgi Gerganov
84d9015c4a
Use vdotq_s32 to improve performance (#67)
* 10% performance boost on ARM

* Back to original change
2023-03-13 18:36:44 +02:00
Georgi Gerganov
c80e2a8f2a
Revert "10% performance boost on ARM"
This reverts commit 113a9e83eb.

There are some reports of illegal instruction errors.
Moved this change to the vdotq_s32 branch until they are resolved.
2023-03-13 01:28:08 +02:00
Georgi Gerganov
54a0e66ea0
Check for vdotq_s32 availability
2023-03-13 01:21:03 +02:00
Georgi Gerganov
543c57e991
Amend to previous commit - forgot to update non-QRDMX branch
2023-03-13 01:05:24 +02:00
Georgi Gerganov
113a9e83eb
10% performance boost on ARM
2023-03-13 00:56:10 +02:00
Sebastián A
eb062bb012
Windows fixes (#31)
* Apply fixes suggested to build on windows

Issue: https://github.com/ggerganov/llama.cpp/issues/22

* Remove unsupported VLAs

* MSVC: Remove features that are only available on MSVC C++20.

* Fix zero initialization of the other fields.

* Change the use of vector for stack allocations.
2023-03-12 22:15:00 +02:00
Georgi Gerganov
f1eaff4721
Add AVX2 support for x86 architectures thanks to @Const-me !
2023-03-11 18:04:25 +02:00
Georgi Gerganov
007a8f6f45
Support all LLaMA models + change Q4_0 quantization storage
2023-03-11 11:28:30 +02:00
Georgi Gerganov
26c0846629
Initial release
2023-03-10 20:56:40 +02:00