
nokia / code-compass

License: BSD-3-Clause
A contextual search engine for software packages built on import2vec embeddings (https://www.code-compass.com)

Programming Languages

Jupyter Notebook
python

Projects that are alternatives of or similar to code-compass

Vectorai
Vector AI — A platform for building vector based applications. Encode, query and analyse data using vectors.
Stars: ✭ 195 (+490.91%)
Mutual labels:  search-engine, embeddings
Search Online
🔍A simple extension for VSCode to search online easily using search engine.
Stars: ✭ 115 (+248.48%)
Mutual labels:  search-engine, vscode-extension
Sitemap
Bolt Sitemap extension - create XML sitemaps for your Bolt website.
Stars: ✭ 19 (-42.42%)
Mutual labels:  search-engine
Keras-Application-Zoo
Reference implementations of popular DL models missing from keras-applications & keras-contrib
Stars: ✭ 31 (-6.06%)
Mutual labels:  embeddings
vscode-vtools
A collection of small tools for Visual Studio Code.
Stars: ✭ 20 (-39.39%)
Mutual labels:  vscode-extension
wolfram-notebook-embedder
JavaScript embedder for Wolfram Cloud notebooks
Stars: ✭ 48 (+45.45%)
Mutual labels:  embeddings
Network-Embedding-Resources
Network Embedding Survey and Resources
Stars: ✭ 43 (+30.3%)
Mutual labels:  embeddings
ncc
Neural Code Comprehension: A Learnable Representation of Code Semantics
Stars: ✭ 162 (+390.91%)
Mutual labels:  embeddings
vscode-saltstack
SaltStack extension for Microsoft Visual Studio Code
Stars: ✭ 26 (-21.21%)
Mutual labels:  vscode-extension
DataverseDevTools-VSCode
The all-in-one tool to develop code for Dataverse/Dynamics 365. Helps you connect to a Dataverse environment, generate TypeScript definitions for entities, create a different type of Dataverse-specific projects, and much more.
Stars: ✭ 18 (-45.45%)
Mutual labels:  vscode-extension
quit-control-vscode
➡️ Stop mistyping keyboard shortcuts and quitting VSCode unintentionally
Stars: ✭ 37 (+12.12%)
Mutual labels:  vscode-extension
upgreat
CLI for a painless way to upgrade your package.json dependencies!
Stars: ✭ 47 (+42.42%)
Mutual labels:  package-management
hohser
Highlight or Hide Search Engine Results
Stars: ✭ 89 (+169.7%)
Mutual labels:  search-engine
vscode-luogu
Solve Luogu Problems in VSCode
Stars: ✭ 62 (+87.88%)
Mutual labels:  vscode-extension
vscode-chat
Chat with your team while you collaborate over code using VS Live Share
Stars: ✭ 496 (+1403.03%)
Mutual labels:  vscode-extension
agda-mode-vscode
agda-mode on VS Code
Stars: ✭ 112 (+239.39%)
Mutual labels:  vscode-extension
powerplatform-vscode
The Power Platform VSCode extension makes it easy to manage Power Platform environments and allows the developer to create, build and deploy Power Platform solutions, packages and portals.
Stars: ✭ 74 (+124.24%)
Mutual labels:  vscode-extension
lda2vec
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (-18.18%)
Mutual labels:  embeddings
imsearch
Framework to build your own reverse image search engine
Stars: ✭ 64 (+93.94%)
Mutual labels:  search-engine
nexus-repository-conan
Conan the Barbarian, C packaging, fun times
Stars: ✭ 37 (+12.12%)
Mutual labels:  package-management


Code Compass is a contextual search engine for software packages developed at Nokia Bell Labs. It supercharges code reuse by recommending the best possible software libraries for your specific software project. See for yourself:

[Image: Code Compass showcase]

Code Compass is available as a website, as a REST API and as an IDE plug-in for Visual Studio Code.

We index packages hosted on NPM for JavaScript, PyPI for Python and Maven Central for Java.

If you're looking for CodeCompass, the similarly named code comprehension tool from Ericsson for exploring large codebases, note that it is a separate project. Apart from the name, there is no relationship (formal or informal) between that project and this one.

Why?

Modern software development is founded on code reuse through open source libraries and frameworks. These libraries are published in software package repositories, which are growing at an exponential rate. By building better software package search tools we aim to stimulate more code reuse and make software packages in the "long tail" more discoverable.

A gentle introduction to the why, what and how of Code Compass can be found in this introductory blog post.

What?

Code Compass is a contextual search engine for software packages.

Code Compass differs from other package search engines in that you can "seed" the search with names of libraries that you already know or use. We call these "context libraries". Code Compass then uses these context libraries to "anchor" the search in those technology stacks that are most relevant to your code.

When using the Visual Studio Code IDE extension there is no need to manually enter context libraries: Code Compass will automatically extract the import dependencies of the active source file to anchor its search.

Note that Code Compass will never send your code to the server. Only the names of third-party modules imported in your code are sent.
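
To make the idea of sending only module names concrete, here is a minimal sketch of how the import dependencies of a Python source file could be extracted. It is not the extension's actual code: the function names `imported_modules` and `third_party_modules` are hypothetical, and filtering out the standard library via `sys.stdlib_module_names` is an assumption about what "third-party" means here.

```python
import ast
import sys


def imported_modules(source):
    """Return the top-level names of all modules imported by `source`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b.c" -> keep only the top-level package "a"
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 marks a relative import (same project), so skip it
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def third_party_modules(source):
    """Drop standard-library modules so that only third-party names remain."""
    # sys.stdlib_module_names exists on Python 3.10+; on older versions this
    # sketch falls back to an empty set and therefore filters nothing.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return imported_modules(source) - set(stdlib)


example = "import os\nimport numpy as np\nfrom pandas.io import json\n"
print(sorted(third_party_modules(example)))
```

Only the resulting set of names would need to leave the machine; the source text itself is parsed locally and never transmitted.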

How?

Code Compass uses unsupervised machine learning to learn how to cluster similar software packages by their context of use, as determined by how libraries get imported alongside other libraries in large open source codebases.

Software packages are represented as vectors, which we call "library vectors" by analogy with word vectors. Just as word2vec turns words into vectors by analyzing how words co-occur in large text corpora, our "import2vec" turns libraries into vectors by analyzing how import statements co-occur in large codebases.
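
As a rough illustration of what "import co-occurrence" can mean, the sketch below counts how often two libraries are imported in the same file. This is a simplification under stated assumptions, not the import2vec pipeline itself: the function name `cooccurrence_counts` is hypothetical, the input data is made up, and the real scripts may define co-occurrence at a different granularity (for example per project rather than per file).

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(import_sets):
    """Count how often each unordered pair of libraries is imported together.

    `import_sets` is an iterable with one collection of library names per
    source file (or per project, depending on the chosen granularity).
    """
    counts = Counter()
    for imports in import_sets:
        # Sort so that each unordered pair always maps to the same key.
        for a, b in combinations(sorted(set(imports)), 2):
            counts[(a, b)] += 1
    return counts


# Made-up example data: three files and the libraries each one imports.
files = [
    {"numpy", "pandas", "matplotlib"},
    {"numpy", "pandas"},
    {"flask", "requests"},
]
counts = cooccurrence_counts(files)
print(counts[("numpy", "pandas")])  # prints 2
```

Pairs such as these play the role that word and context-word pairs play in word2vec: libraries that frequently appear together end up with nearby vectors after training.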

You can read the details in our MSR 2019 paper. Supplementary material including trained library embeddings for Java, JavaScript and Python is available on Zenodo.

As an example, for Java we looked at a large number of open source projects on GitHub and libraries on Maven Central and studied how libraries are imported across these projects. We identified large clusters of projects related to web frameworks, cloud computing, network services and big data analytics. Well-known projects such as Apache Hadoop, Spark and Kafka were all clustered into the same region because they are commonly used together to support big data analytics.
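
One common way to inspect such clusters is to rank libraries by the cosine similarity of their vectors. The sketch below assumes this measure; the source does not state which similarity Code Compass uses. The three-dimensional vectors are invented for illustration and are not the trained embeddings, and the helper names `cosine` and `nearest` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(name, vectors, k=2):
    """Return the k libraries whose vectors are most similar to `name`."""
    query = vectors[name]
    scored = [
        (cosine(query, vec), other)
        for other, vec in vectors.items()
        if other != name
    ]
    return [other for _, other in sorted(scored, reverse=True)[:k]]


# Invented toy vectors; real library vectors have many more dimensions.
vectors = {
    "hadoop": [0.9, 0.1, 0.0],
    "spark": [0.8, 0.2, 0.1],
    "kafka": [0.7, 0.3, 0.0],
    "spring-web": [0.1, 0.9, 0.2],
}
print(nearest("hadoop", vectors))  # prints ['spark', 'kafka']
```

With these toy values the big data libraries rank closest to each other, while the web framework is far away, which mirrors the clustering described above.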

Below is a 3D visualization (a t-SNE plot) of the learned vector space for Java. Each dot represents a Java library and the various colored clusters correspond to different niche areas that were discovered in the data. We highlighted the names of Apache projects.

[Image: 3D t-SNE visualization of the learned Java library vector space]

What's in this repo?

  • docs/: REST API docs for the Code Compass search service
  • plugins/vscode/: Visual Studio Code extension to integrate Code Compass into the IDE
  • scripts/: data extraction scripts to generate library import co-occurrences from source code
  • nbs/: Jupyter notebooks with TensorFlow models to train library embeddings from import co-occurrence data

Team

Code Compass is developed by a research team in the Application Platforms and Software Systems Lab of Nokia Bell Labs.

See CONTRIBUTORS for an alphabetical list of contributors to Code Compass.

Contributing

If you would like to train embeddings for other languages, have a look at the scripts under import2vec to get an idea of what data is required.

If you have suggestions for improvement or user feedback, or want to report a bug, please open an issue in this repository.

License

BSD-3-Clause
