@southern-cross-ai

Southern Cross AI

Australia's First Large Language Model Research Initiative
Southern Cross AI   Website   Discord

✨ Welcome to Southern Cross AI ✨
We aim to develop Australia's First Open-Source Large Language Model
through collaboration across the academic, research, government, and business sectors.

Wanna make friends and munch some snacks? Let's Meetup!

Join our exciting 12-week series of Meetup events (Aug 5 - Oct 7), held every Monday:

Next upcoming event: Little Joey 7b Training and Improvement Session

New kid in town? No worries, we got you!

Onboard LLMs

LLM Battleground

LLM Playground

Misc

Call for Contributors - We need your magic to make things happen

  • Data Source Contributor 🕵️‍♀️
    • Identify and provide access to Australia-related data sources.
    • Collaborate with other contributors to ensure data quality and relevance.
  • Data Collecting, Crawling and Scraping 👩‍🌾
    • Develop scripts and tools to collect data from various sources.
    • (Optional) Have experience with web scraping tools (e.g., BeautifulSoup, Scrapy).
  • Data Cleaning 👩‍⚕️
    • Clean and preprocess datasets to ensure they are ready for analysis and modeling.
    • (Optional) Have experience with data manipulation libraries (e.g., Pandas, NumPy).
  • Model Building, Training and Tuning 👩‍💻
    • Develop and train LLMs using our datasets.
    • Have experience with machine learning frameworks (e.g., TensorFlow, PyTorch).
  • GitHub Organising 👩‍🔧
    • Manage the GitHub repository by organizing files, documentation, and issues.
    • (Optional) Have proficiency in using Git and GitHub.
  • Hugging Face Organising 👩‍🏭
    • Manage and organize model versions and datasets.
    • Ensure proper documentation and metadata for each model and dataset.
  • Social Media Organising 👩‍💼
    • Promote the project and its updates on social media platforms (e.g., Discord, Meetup).
    • Engage with the community to increase project visibility and collaboration.

Can't wait to join us? Send a message to our lovely team members:

Pinned

  1. Dataset-Repo-Template Public template

    A Template for Creating Your Dataset Repos

    1

  2. Gutenberg-Data Public

    HTML 3 2

Repositories

Showing 10 of 35 repositories
  • BabyJoey Public

    Small 115-million-parameter model (0.5 GB)

    Jupyter Notebook 3 Apache-2.0 7 0 3 Updated Sep 23, 2024
  • .github Public

    These are the default community health files for Southern Cross AI's GitHub profile.

    0 Apache-2.0 0 0 0 Updated Sep 16, 2024
  • Braided-Channels Public

    Interview Dataset from the Braided Channels Research Collection

    Jupyter Notebook 1 MIT 0 1 0 Updated Sep 7, 2024
  • OpenAustralia Public

    Dataset of House and Senate Debates from the Australian Parliament

    HTML 1 MIT 0 1 0 Updated Sep 5, 2024
  • Inside-Airbnb-Australia Public

    Airbnb's Residential Dataset (Australia)

    Jupyter Notebook 1 MIT 0 1 0 Updated Sep 5, 2024
  • ICE-AUS Public

    Corpus Dataset from the Australian component of the International Corpus of English (ICE-AUS)

    Python 1 MIT 0 0 0 Updated Sep 5, 2024
  • CoANZSE Public

    Dataset from the Corpus of Australian and New Zealand Spoken English (CoANZSE)

    Python 1 0 0 0 Updated Sep 5, 2024
  • Dewr-data
    0 0 1 0 Updated Aug 27, 2024
  • AU-website Public

    This is the fo

    Python 0 MIT 0 1 0 Updated Aug 25, 2024
  • Youtube-Data Public

    Crawls comments from YouTube using the official API

    Python 0 MIT 0 0 0 Updated Aug 24, 2024
