Student Projects
Below is the list of tasks that are good for student projects (course or graduate work).
YDB has a Coordination Service which allows your client application to elect a leader via a distributed lock (similar to ZooKeeper). The task is to add support for the Coordination Service to the Rust SDK.
Mentor: Timofey Koolin (https://github.com/rekby)
Add support for sqlx.StructScan() from github.com/jmoiron/sqlx to the Go SDK.
Mentor: Aleksei Miasnikov (https://github.com/asmyasnikov)
Add the capability to load plugins implemented as shared libraries (so/dll) into the C++ SDK.
Example: YDB supports different authorisation mechanisms, so it's a good idea to implement them as plugins to keep code dependencies clear.
Mentor: Daniil Cherednik (https://github.com/dcherednik)
Out-of-the-box monitoring for your client application is awesome. We have some ideas on how to extend the C++ SDK monitoring facilities.
TODO: detailed description
Mentor: Daniil Cherednik (https://github.com/dcherednik)
The current C++ SDK implementation requires calling the driver.Stop(true) method at the end of the program. Some internal SDK routines can invoke gRPC calls outside the context of a user call, but gRPC does not allow such calls after main has returned. Having to call driver.Stop(true) is inconvenient in real applications because it is often difficult to control where the driver is constructed. The simplest solution is to make the driver a singleton object. A singleton is reasonable here because a driver can work with multiple databases or multiple clusters while effectively sharing threads, connections, and other gRPC resources. Other solutions (such as using the atexit function) are still open for discussion. This task requires good knowledge of multithreaded programming and the ability to write portable code.
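A minimal sketch of the singleton idea described above, not the SDK's actual design. `Driver` is a hypothetical stand-in for the real SDK class, and `DriverHolder`, `Get`, and `Shutdown` are illustrative names that do not exist in the SDK. The sketch assumes the application calls `Shutdown()` once before main returns; an atexit-based variant would move that call into a registered handler.

```cpp
#include <atomic>
#include <memory>
#include <mutex>

// Hypothetical stand-in for the SDK driver; the real class owns threads,
// connections, and gRPC resources.
class Driver {
public:
    Driver() = default;
    Driver(const Driver&) = delete;
    Driver& operator=(const Driver&) = delete;

    // Mirrors driver.Stop(true): in the real SDK this would wait for
    // in-flight work. Here it only records that the driver was stopped.
    void Stop(bool wait) {
        (void)wait;
        stopped_.store(true);
    }

    bool IsStopped() const { return stopped_.load(); }

private:
    std::atomic<bool> stopped_{false};
};

// Process-wide holder (illustrative name). The driver is created lazily on
// first use and shared by every caller, wherever it is requested.
class DriverHolder {
public:
    // Returns the shared driver, creating it if none exists yet.
    static std::shared_ptr<Driver> Get() {
        std::lock_guard<std::mutex> lock(Mutex());
        std::shared_ptr<Driver>& instance = Instance();
        if (!instance) {
            instance = std::make_shared<Driver>();
        }
        return instance;
    }

    // Stops and releases the shared driver. Safe to call more than once.
    // Assumption: called from a single known place before main returns.
    static void Shutdown() {
        std::lock_guard<std::mutex> lock(Mutex());
        std::shared_ptr<Driver>& instance = Instance();
        if (instance) {
            instance->Stop(true);
            instance.reset();
        }
    }

private:
    // Function-local statics avoid static initialization order problems.
    static std::mutex& Mutex() {
        static std::mutex mutex;
        return mutex;
    }

    static std::shared_ptr<Driver>& Instance() {
        static std::shared_ptr<Driver> instance;
        return instance;
    }
};
```

With this shape, any part of the application can call `DriverHolder::Get()` without knowing where the driver was first constructed, and only one place needs to call `DriverHolder::Shutdown()`.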
Mentor: Daniil Cherednik (https://github.com/dcherednik)
Currently, the YDB CLI (https://ydb.tech/en/docs/getting_started/cli) doesn't support an interactive mode. Interactive mode means that you can run the ydb program and it will let you write queries and get responses, much like the psql program does.
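To make the idea concrete, here is a rough sketch of a read-execute-print loop of the kind an interactive mode would need. It is not based on the YDB CLI code: `RunInteractive`, `RunScript`, `QueryExecutor`, the prompts, the `exit`/`quit` commands, and the rule that a statement ends at a trailing `;` are all assumptions made for illustration. Query execution is passed in as a callback so the loop stays independent of any real client.

```cpp
#include <cstddef>
#include <functional>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical callback: takes a statement and returns text to print.
using QueryExecutor = std::function<std::string(const std::string&)>;

// Reads lines from `in`, accumulates them until a line ends with ';', then
// passes the statement (without the ';') to `execute` and prints the result.
// Stops on EOF or on "exit"/"quit" typed at the primary prompt.
// Returns the number of executed statements.
std::size_t RunInteractive(std::istream& in, std::ostream& out,
                           const QueryExecutor& execute) {
    std::string statement;
    std::string line;
    std::size_t executed = 0;

    out << "ydb> ";
    while (std::getline(in, line)) {
        if (statement.empty() && (line == "exit" || line == "quit")) {
            break;
        }
        if (!statement.empty()) {
            statement += '\n';
        }
        statement += line;

        const std::size_t last = statement.find_last_not_of(" \t\r\n");
        if (last == std::string::npos) {
            // Only whitespace so far: drop it and show the primary prompt.
            statement.clear();
            out << "ydb> ";
        } else if (statement[last] == ';') {
            out << execute(statement.substr(0, last)) << '\n';
            statement.clear();
            ++executed;
            out << "ydb> ";
        } else {
            // Statement is incomplete: show a continuation prompt.
            out << "...> ";
        }
    }
    return executed;
}

// Usage example: drives the loop from an in-memory script instead of a
// terminal and returns everything that would have been printed.
std::string RunScript(const std::string& script, const QueryExecutor& execute) {
    std::istringstream in(script);
    std::ostringstream out;
    RunInteractive(in, out, execute);
    return out.str();
}
```

A real implementation would also need line editing, history, and proper parsing of `;` inside string literals, none of which this sketch attempts.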
Mentor: Nikolay Perfilov (https://github.com/pnv1)
Currently, the YDB CLI (https://ydb.tech/en/docs/getting_started/cli) supports only the CSV and TSV input formats. There are many other common formats we should support, such as JSON, Parquet, Avro, MessagePack, Debezium (over JSON or Avro), ORC, Protobuf, and so on. You may be interested in this task:
- if you want to know how modern systems serialize their data
- if you want to get experience in data transfer between such systems
Mentor: ???
If you want to dive into YDB's core, this is the task you are looking for. Writing to and reading from YDB Distributed Storage (DS) efficiently is very important. For every write to DS, YDB's dsproxy component generates several TEvVPut messages (https://github.com/ydb-platform/ydb/blob/main/ydb/core/blobstorage/vdisk/common/vdisk_events.h#L504) because we write multiple replicas or erasure parts. Each TEvVPut message is serialized before being sent over the wire. The task is to optimize TEvVPut serialization. Currently we use Google protobuf for message serialization, but the options are:
- Use Google protobuf for metadata serialization only, and don't put opaque data into the proto message; place it right after the protobuf message;
- Use FlatBuffers;
- Use a custom protocol.
We expect you to propose some solutions, implement them, and compare their performance with benchmarks.
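The first option in the list above (metadata serialized separately, opaque data placed right after it) can be sketched as a simple frame layout. This is an illustration under assumptions, not the TEvVPut wire format: the 4-byte little-endian length prefix and the names `EncodeFrame`, `DecodeFrame`, and `FrameView` are hypothetical, and `metadata` stands for bytes already produced by a serializer such as protobuf.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <stdexcept>
#include <string>
#include <string_view>

// Non-owning views into a decoded frame; valid only while the frame lives.
struct FrameView {
    std::string_view metadata;  // already-serialized metadata bytes
    std::string_view payload;   // opaque data, never parsed
};

// Assumed layout: [u32 little-endian metadata size][metadata][payload].
// The payload is appended as-is, so it is never copied into a message field.
std::string EncodeFrame(std::string_view metadata, std::string_view payload) {
    if (metadata.size() > std::numeric_limits<std::uint32_t>::max()) {
        throw std::length_error("metadata too large for a 32-bit length prefix");
    }
    const auto size = static_cast<std::uint32_t>(metadata.size());

    std::string frame;
    frame.reserve(4 + metadata.size() + payload.size());
    for (int shift = 0; shift < 32; shift += 8) {
        frame.push_back(static_cast<char>((size >> shift) & 0xFF));
    }
    frame.append(metadata);
    frame.append(payload);
    return frame;
}

// Splits a frame back into metadata and payload without copying.
// Returns std::nullopt if the frame is truncated or the prefix is invalid.
std::optional<FrameView> DecodeFrame(std::string_view frame) {
    if (frame.size() < 4) {
        return std::nullopt;
    }
    std::uint32_t size = 0;
    for (int i = 0; i < 4; ++i) {
        size |= static_cast<std::uint32_t>(static_cast<unsigned char>(frame[i]))
                << (8 * i);
    }
    if (size > frame.size() - 4) {
        return std::nullopt;
    }
    const std::size_t metadata_size = size;
    return FrameView{frame.substr(4, metadata_size),
                     frame.substr(4 + metadata_size)};
}
```

The receiver parses only the small metadata part, and the payload stays a contiguous byte range that can be handed on without re-encoding. Whether this is actually faster than the current approach is exactly what the benchmark in this task should establish.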
Mentor: Aleksey Stankevichus (https://github.com/the-ancient-1)
YDB uses the lwtrace library to trace events in the system and to debug issues.
TODO: add detailed description
Mentor: Aleksey Stankevichus (https://github.com/the-ancient-1)
TODO: add detailed description
Mentor: Andrey Fomichev (https://github.com/fomichev3000)
There is currently a partial solution for this project. You can find the solution here. However, this solution has significant disadvantages: low performance and missing support for some etcd services, such as Auth, Lease, Cluster, Maintenance, Watch, AlarmMember, and so on. The full list of services can be found here. This project invites you to take part in this competition and solve these problems.
Language: C++
Mentor: Oleg Doronin (https://github.com/dorooleg)
https://github.com/ydb-platform/ydb/issues/101
Mentor: Ilnaz Nizametdinov (https://github.com/CyberROFL)
R7 Office is becoming more and more popular. It allows importing data from external databases. Similar solutions already exist for other databases, for example ClickHouse: https://github.com/in2sql/xldb-r7-free. This project proposes implementing this functionality for YDB.
Language: JavaScript
Mentor: Oleg Doronin (https://github.com/dorooleg)
YDB Federated Query allows users to perform a joint analysis of data stored in various external data sources. A special service written in Go, fq-connector-go, provides YDB with an abstraction layer of a generic data source, hiding the peculiarities and details of particular sources. Currently we are focused on extending the list of supported data sources with traditional RDBMSs (such as PostgreSQL, MySQL, MS SQL, Oracle, and so on). However, YDB Federated Query is not limited to them, so we are also looking to support NoSQL databases, object storages, message queues, monitoring systems, and log storages.
Languages: Go, C++, Python
Mentor: Vitaly Isaev (https://github.com/vitalyisaev2)