Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add some basic structure for quantizer, io and simd operators #25

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

LHT129
Copy link
Collaborator

@LHT129 LHT129 commented Sep 12, 2024

  • only header for quantizer and io
  • fp32 quantizer as example


private:
[[nodiscard]] inline bool
CheckValidOffset(uint64_t size) const {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unify the naming convention; our private methods generally use the underscore naming style. (check_valid_offset)

#include "fixtures.h"
using namespace vsag;

#define TEST_RECALL(Func) \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renaming it to TEST_ACCURACY would be more accurate.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

auto val = query[i] - codes[i];
result += val * val;
}
return result;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no sqrt in the result. So, it is better to rename it to "FP32ComputeL2".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

L2Sqr is the Square of L2 norm

private:
uint64_t dim_{0};

uint64_t codeSize_{0};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unify the code style.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

codeSize -> code_size_

@LHT129 LHT129 force-pushed the lht_dev branch 2 times, most recently from f1ae5ea to ca3eda5 Compare September 12, 2024 09:36
@LHT129 LHT129 requested a review from inabao September 13, 2024 02:46
@LHT129 LHT129 force-pushed the lht_dev branch 2 times, most recently from eb24b83 to 414ce55 Compare September 13, 2024 02:59
auto* outVec = new float[dim];
quant.DecodeOne(codes, outVec);
for (int i = 0; i < dim; ++i) {
REQUIRE(std::abs(vecs[idx * dim + i] - outVec[i]) < error);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should replace abs with fabs?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For integral arguments, the integral overloads of std::abs are likely better matches.
reference: https://en.cppreference.com/w/cpp/numeric/math/fabs

sum = _mm256_add_ps(sum, _mm256_mul_ps(a, b)); // accumulate the product
}
alignas(32) float result[8];
_mm256_store_ps(result, sum); // store the accumulated result into an array
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When n == 0, the store instruction (e.g. _mm256_store_ps) still occur in the avx512, avx2, and sse operators that may be harmful to the performance. Is it possible to add a judgment to skip these store instructions?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I have fixed it

} else if (metric == vsag::MetricType::METRIC_TYPE_L2SQR) {
gt = L2Sqr(data + idx1 * dim, query + i * dim, &dim);
}
REQUIRE(std::abs(gt - value) < 1e-4);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fabs?

Copy link
Collaborator

@inabao inabao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@LHT129 LHT129 force-pushed the lht_dev branch 6 times, most recently from 24002ac to d5e42ef Compare September 14, 2024 07:07
Copy link
Collaborator

@wxyucs wxyucs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm


namespace vsag {

template <typename IOTmpl>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IOTmpl need any interface limit ?

return ret;
}
void
MemoryIO::PrefetchImpl(uint64_t offset, uint64_t cacheLine) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explicit mark cacheLine to unused ?
cacheLine -> cache_line


namespace vsag {

template <MetricType Metric = MetricType::METRIC_TYPE_L2SQR>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metric -> metric

}

inline void
Prefetch(uint64_t offset, uint64_t cacheLine = 64) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cacheLine -> cache_line

MultiReadImpl(uint8_t* datas, uint64_t* sizes, uint64_t* offsets, uint64_t count) const;

inline void
PrefetchImpl(uint64_t offset, uint64_t cacheLine = 64);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cacheLine -> cache_line


uint64_t codeSize_{0};

bool isTrained_{false};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isTrained_ -> is_trained_

private:
uint64_t dim_{0};

uint64_t codeSize_{0};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

codeSize -> code_size_

} else if (Metric == MetricType::METRIC_TYPE_COSINE) {
return InnerProduct(codes1, codes2, &this->dim_); // TODO
} else {
return 0.;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.0 ?

@wxyucs wxyucs added the kind/improvement Code improvements (variable/function renaming, refactoring, etc. ) label Sep 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/improvement Code improvements (variable/function renaming, refactoring, etc. )
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants