wip notes:
- RocksDB-level metrics would be useful
!! Replica set failover causes any in-flight requests to time out (and get dropped) !!
RPC error handling is bad. Need a way to serialize app-level errors across the RPC boundary
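Rough sketch of what serializing app-level errors across the RPC boundary could look like. Everything here is an assumption for illustration: the JSON envelope, the `AppError` fields, and the `SHARD_UNAVAILABLE` code are all hypothetical, not the current wire format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class AppError:
    # All fields are hypothetical; pick whatever the app layer actually needs.
    code: str        # stable machine-readable code, e.g. "SHARD_UNAVAILABLE"
    message: str     # human-readable detail
    retryable: bool  # lets the caller decide to retry instead of just timing out


def encode_error(err: AppError) -> bytes:
    """Wrap an app-level error in an envelope that can cross the RPC boundary."""
    return json.dumps({"error": asdict(err)}).encode("utf-8")


def decode_response(payload: bytes):
    """Return the result on success, or an AppError if the server sent one."""
    body = json.loads(payload.decode("utf-8"))
    if "error" in body:
        return AppError(**body["error"])
    return body["result"]
```

With something like this, a failover could come back as a retryable error rather than as a dropped request that only shows up as a timeout.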
Thoughts on “Leased” operations:
- Must skip raft loop for acceptable perf
- need to handle failover case. Shard failover could cause “lost updates” if we skip raft
- Gain a lease to a specific shard in the replica set. This shard serves all read traffic?
- can clients read from a replica? ES seems to think this is OK
- if so, still need to obtain a lease to know when the replica is fully up to date
- Ultimately, need to guarantee all reads after a refresh are consistent
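Minimal single-process sketch of the lease idea above, to make the read path concrete. Assumptions: a time-based lease with a fixed duration, a hypothetical `ShardLease` class, and a trusted local clock. A real version would have to grant leases through raft (or similar) and account for clock skew, which this ignores.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Lease:
    holder: str        # node id of the shard that holds the lease
    expires_at: float  # clock value after which the lease is no longer valid


class ShardLease:
    """Hypothetical lease on one replica set; not a real API."""

    def __init__(self, duration_s: float, clock: Callable[[], float] = time.monotonic):
        self.duration_s = duration_s
        self.clock = clock
        self.lease: Optional[Lease] = None

    def acquire(self, node_id: str) -> bool:
        """Grant or renew the lease if it is free, expired, or already ours."""
        now = self.clock()
        free = self.lease is None or self.lease.expires_at <= now
        if free or self.lease.holder == node_id:
            self.lease = Lease(node_id, now + self.duration_s)
            return True
        return False

    def can_serve_reads(self, node_id: str) -> bool:
        """Only the current, unexpired lease holder may serve reads without raft."""
        lease = self.lease
        return (
            lease is not None
            and lease.holder == node_id
            and self.clock() < lease.expires_at
        )
```

The failover case maps to the lease expiring: the old holder stops serving reads once `can_serve_reads` turns false, and only then can another shard acquire the lease.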
Missing Validation:
- Internal mappings fields
- Gossip shard/index routing information
- Done for indices. Need to figure out when to refresh it and do it for shard-level routing
- Update local cache from gossips
- Check meta layer synchronously if cache missed / was wrong
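Sketch of the cache flow in the bullets above: gossip updates the local routing cache, and the meta layer is consulted synchronously on a miss or when the cached route turned out to be wrong. The `RoutingCache` class, the per-entry version number, and the `fetch_from_meta` callback are assumptions for illustration only.

```python
from typing import Callable, Dict, Tuple

# (version, node) for a shard; the version is an assumed monotonically
# increasing number attached to each routing update.
Route = Tuple[int, str]


class RoutingCache:
    """Hypothetical local cache of shard -> node routing."""

    def __init__(self, fetch_from_meta: Callable[[str], Route]):
        self._fetch = fetch_from_meta  # synchronous call into the meta layer
        self._entries: Dict[str, Route] = {}

    def apply_gossip(self, shard: str, version: int, node: str) -> None:
        """Accept a gossiped route only if it is newer than what we have."""
        current = self._entries.get(shard)
        if current is None or version > current[0]:
            self._entries[shard] = (version, node)

    def lookup(self, shard: str) -> str:
        """Return the cached node, falling back to the meta layer on a miss."""
        if shard not in self._entries:
            self._entries[shard] = self._fetch(shard)
        return self._entries[shard][1]

    def report_wrong(self, shard: str) -> str:
        """Cached route was wrong: refresh from the meta layer unconditionally."""
        self._entries[shard] = self._fetch(shard)
        return self._entries[shard][1]
```

When to refresh proactively is still open; this only covers the miss and wrong-route paths.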
Deletes are hard: Elasticsearch uses a time window
- Strings
- Longs
- Objects
- https://github.com/cockroachdb/cockroach/blob/master/docs/design.md#range-leases
- https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20160210_range_leases.md
Shouldn’t have to force the user to configure this. We should figure out node ids from the address or something else
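One possible way to derive a node id from the address, as a sketch only. Assumptions: ids are unsigned 64-bit integers, and the hypothetical `node_id_from_address` helper hashes `host:port`. Caveats: collisions are unlikely but possible, and the id changes whenever the address does.

```python
import hashlib


def node_id_from_address(host: str, port: int) -> int:
    """Derive a stable 64-bit id from host:port (hypothetical helper)."""
    digest = hashlib.sha256(f"{host}:{port}".encode("utf-8")).digest()
    # Take the first 8 bytes so the id fits in an unsigned 64-bit integer.
    return int.from_bytes(digest[:8], "big")
```

The same address always yields the same id across restarts, so nothing needs to be configured by hand.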
- Max field count is 255
- Threading is weird. I want to manage the thread pool across many indices
- No way to have a user-defined doc id
- Avoid dirty reads!
- Can we provide even better consistency guarantees?
- Joins!
- Autoscaling
- Split shards at certain conditions
- Add replicas at certain conditions
Elasticsearch ids: https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/index/mapper/Uid.java#L178