🤖
Training teeny tiny models

SulRash/README.md

Hi there! I'm Sultan!

Homepage

🤖 AI researcher working on multilingual NLP, smol language modelling, huge language models, and educational AI! Previously helped build ALLaM, a state-of-the-art Arabic-English language model :)

📚 Recent Publications

  • SmolTulu - Highest-performing sub-2B model on reasoning benchmarks - An investigation into learning rate & batch size ratios
  • Fineweb-Edu-Ar - Largest open-source machine translated Arabic educational dataset
  • ALLaM - State-of-the-art Arabic-English LLM
  • When Benchmarks are Targets - Analysis of LLM evaluation sensitivity (ACL 2024)

🚀 Projects

🌍 Let's connect! Find me on LinkedIn!

Pinned

  1. envenc Public

    Repository for environment encoder, an attempt at improving reinforcement learning agents' generalisability through learning how to act on universal multimodal embeddings generated by a vision-lang…

    Python 3

  2. deepspeedai/Megatron-DeepSpeed Public

    Forked from NVIDIA/Megatron-LM

    Ongoing research training transformer language models at scale, including: BERT & GPT-2

    Python 2k 353

  3. minLLMTrain Public

    Minimal yet high-performance code for pretraining LLMs. Attempts to implement some SOTA features. Implements training through DeepSpeed, Megatron-LM, and FSDP. WIP

    Python 6

  4. Cheatsheet Public

    An attempt at improving facial recognition performance through appending a 'cheatsheet' to an image with one positive sample and multiple negatives during training.

    Python 6

  5. huggingface-text-data-analyzer Public

    Analyzes text datasets from Hugging Face for training LLMs!

    Python 7

  6. AnshulSood11/Engagement-Level-Prediction Public

    Engagement Intensity Prediction in Real Time

    C++ 16 9

882 contributions in the last year


Activity overview

Contributed to SulRash/envenc, SulRash/ntaGPT, SulRash/huggingface-text-data-analyzer and 16 other repositories
SulRash's contributions from March 31, 2024 to April 01, 2025 are 96% commits, 3% pull requests, 1% issues, and 0% code review.
