@AlignmentResearch

FAR.AI

Frontier alignment research to ensure the safe development and deployment of advanced AI systems.

Popular repositories

  1. tuned-lens Public

    Tools for understanding how transformer predictions are built layer-by-layer

    Python 463 48

  2. go_attack Public

    Python 84 7

  3. vlmrm Public

    Python 47 12

  4. gpt-4-novel-apis-attacks Public

    19 1

  5. learned-planner Public

    Interpretability tools for recurrent networks that play Sokoban

    Python 10 2

  6. scaling-poisoning Public

    Python 6

Repositories

Showing 10 of 34 repositories
  • HarmBench Public Forked from centerforaisafety/HarmBench

    Fork of HarmBench for getting R2D2 working

    Jupyter Notebook 0 MIT 64 0 0 Updated Feb 1, 2025
  • train-learned-planner Public

    Experimenting with CleanRL for learned-planners

    Python 4 0 1 2 Updated Jan 31, 2025
  • KataGoVisualizer
    HTML 3 MIT 1 6 0 Updated Jan 29, 2025
  • gym-sokoban Public

    Sokoban environment for Gym

    Python 0 MIT 0 0 0 Updated Jan 27, 2025
  • oauth2-proxy-buildpack
    Shell 0 Apache-2.0 0 0 0 Updated Jan 27, 2025
  • farconf Public

    Easy dataclass-based configuration for ML projects

    Python 1 0 0 0 Updated Jan 17, 2025
  • go_attack Public
    Python 84 MIT 7 12 0 Updated Jan 15, 2025
  • KataGo-custom
    C++ 5 1 6 0 Updated Jan 15, 2025
  • envpool Public Forked from sail-sg/envpool

    C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.

    C++ 0 Apache-2.0 111 0 0 Updated Jan 15, 2025
  • learned-planner Public

    Interpretability tools for recurrent networks that play Sokoban

    Python 10 Apache-2.0 2 0 0 Updated Jan 15, 2025
