xcodemind/naturalcc

This repository is no longer maintained. The latest version is at CGCL-codes/naturalcc.

NaturalCC v0.4.0

NaturalCC is a sequence modeling toolkit that lets researchers and developers train custom models for many software engineering tasks, e.g., code summarization, code retrieval, and code clone detection. Our vision is to bridge the gap between programming languages and natural language using machine learning techniques.

About us: XCodeMind

NaturalCC demo page: NCC demo

This repository is an ongoing project, and we welcome you to take part in its development. If you run into a bug or any other problem while using it, feel free to contact us and we will do our best to help. If you would like to merge your own workflow into this project, please submit a pull request.

This project is inspired by fairseq, and we are grateful to its authors for making it available.

Version


Features

  • mixed precision training
  • multi-GPU training
  • raw/bin data reading/writing

Code Tasks

TBC...

Dataset

Currently, we have processed the following datasets:

TBC...

TBC:

Please wait.

Requirements

  • PyTorch version >= 1.4.0
  • Python version >= 3.6
  • For training new models, you'll also need an NVIDIA GPU and NCCL
  • For faster training, install NVIDIA's apex library with the --cuda_ext and --deprecated_fused_adam options
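The requirements above can be checked before installing anything else. The following is a minimal sketch, not part of NaturalCC: the helper names `parse_version` and `check_environment` are hypothetical, and the script only reports what it finds (Python and PyTorch versions, CUDA and NCCL availability, and whether apex is importable).

```python
import importlib.util
import re
import sys

# Minimum versions taken from the requirements list above.
MIN_PYTHON = (3, 6)
MIN_TORCH = (1, 4, 0)


def parse_version(text):
    """Turn a version string such as '1.13.1+cu117' into a tuple like (1, 13, 1)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    # A missing patch component (e.g. '1.4') is treated as 0.
    return tuple(int(part) for part in match.groups(default="0"))


def check_environment():
    """Return a dict mapping each requirement to True or False (hypothetical helper)."""
    report = {
        "python": sys.version_info[:2] >= MIN_PYTHON,
        "torch": False,
        "cuda": False,
        "nccl": False,
        "apex": importlib.util.find_spec("apex") is not None,
    }
    if importlib.util.find_spec("torch") is None:
        return report  # PyTorch is not installed; the GPU checks cannot run.

    import torch
    import torch.distributed as dist

    report["torch"] = parse_version(torch.__version__) >= MIN_TORCH
    report["cuda"] = torch.cuda.is_available()
    report["nccl"] = dist.is_available() and dist.is_nccl_available()
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name:8s} {'ok' if ok else 'missing or too old'}")
```

A `False` entry for `cuda`, `nccl`, or `apex` only matters for training; the Python and PyTorch entries apply to every use of the toolkit.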

Installation

1) Install apex

Apex is needed to support half-precision training.

git clone https://github.com/NVIDIA/apex.git
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" .

2) Install other prerequisite libraries

git clone https://github.com/xcodemind/naturalcc
cd naturalcc
pip install -r requirements.txt

# or install with conda 
# conda install --yes --file requirements.txt

install.md describes installation in a virtual environment in detail. If you run into problems during installation, refer to that file.

3) Install NCC

# build for development 
python setup.py build_ext --inplace

# install 
pip install --editable ./

License

NaturalCC is MIT-licensed. The license applies to the pre-trained models as well.

Citation

Please cite as: xxx