CodeMaker AI Recreates 90,000 Lines of Code with 91% Similarity
August 6, 2024 - 12:39AM
CodeMaker AI has demonstrated the capability of its fine-tuned AI
model to autonomously recreate a 90,000-line software library with
91% similarity to the original. The CodeMaker AI system processed 3,251
files and generated 90,063 lines of code within just 1 hour and 42
minutes, at a total cost of $265.73. The generated artifact has
been published on GitHub (source). Estimates based on the COCOMO
model suggest that manually writing such code would take
approximately 25 years.
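The release does not say which COCOMO variant or parameters produced the 25-year figure. As a rough illustration only, the sketch below applies the published Basic COCOMO effort formula (effort in person-months = a * KLOC^b) to the 90,063 generated lines. The function names are hypothetical, and using the raw line count as the KLOC input is an assumption. Under the "organic" coefficients this yields roughly 22 to 23 person-years, the same order of magnitude as the quoted estimate.

```python
# Basic COCOMO (Boehm, 1981): effort in person-months = a * KLOC ** b.
# Which variant the release used is not stated; these are the textbook
# coefficients for the three Basic COCOMO project classes.
COEFFICIENTS = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def effort_person_months(lines_of_code, mode="organic"):
    """Estimated effort in person-months for a given line count."""
    a, b = COEFFICIENTS[mode]
    kloc = lines_of_code / 1000  # COCOMO works in thousands of lines
    return a * kloc ** b


def effort_person_years(lines_of_code, mode="organic"):
    """Same estimate expressed in person-years (12 person-months each)."""
    return effort_person_months(lines_of_code, mode) / 12


if __name__ == "__main__":
    # 90,063 is the generated line count reported in the release.
    for mode in COEFFICIENTS:
        years = effort_person_years(90_063, mode)
        print(f"{mode}: {years:.1f} person-years")
```

The result is person-effort, not calendar time, and it shifts considerably with the project class chosen, so it should be read as an order-of-magnitude check rather than a reproduction of the release's number.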
This achievement was made possible through the development of a
custom fine-tuning pipeline, enabling the AI model to be trained on
an entire codebase. The same pipeline lets users fine-tune their own
models, tailoring them to specific codebases, coding styles, and
problem domains.
The primary goal of this experiment was to validate the accuracy
and reliability of the fine-tuned models. These models can now be
applied across a range of feature sets, including automated code
generation, code completion, and enhanced chat experiences, among
others.
This development builds on previous experimentation conducted
with smaller-scale Java codebases. Earlier efforts focused on
recreating the implementation code of the open-source libraries
Mockito (source) and JUnit (source).
CodeMaker AI, founded in March 2023 and based in Vancouver, BC,
is a startup dedicated to providing innovative tools and automation
solutions for software developers. The company's offerings focus on
streamlining the writing, testing, and documentation of source
code.
CodeMaker AI Inc.
Jakub Narloch
email: contact@codemaker.ai
website: https://codemaker.ai