# c7score

Evaluates the quality of code snippets.

```bash
npm install @upstash/c7score
```

The c7score package is used to evaluate the quality of Upstash's Context7 code snippets. c7score grades quality with five metrics (question answering, LLM analysis, formatting, project metadata, and initialization), which fall into two groups: LLM analysis and rule-based text analysis.

## Exports
1. getScore - evaluates a single library based on 5 metrics
2. compareLibraries - evaluates two similar libraries based on 5 metrics
3. scoreQA - evaluates how well code snippets answer provided questions
The .env file must always contain the following:

```text
CONTEXT7_API_TOKEN=...
```

This library can be used with both Vertex AI and the Gemini API. To use Vertex AI, the .env file must contain the following:

```text
VERTEX_AI=true
GOOGLE_CLOUD_PROJECT=...
GOOGLE_APPLICATION_CREDENTIALS=path_to_credentials
```

If using the Gemini API:

```text
GEMINI_API_TOKEN=...
```

Example usage:

```typescript
import { getScore, compareLibraries, scoreQA } from "@upstash/c7score";

await getScore(
  "/websites/python_langchain",
  "1. What is a selector, and how do I use it?",
  {
    report: {
      console: true,
      folderPath: "results",
    },
    weights: {
      question: 0.8,
      llm: 0.05,
      formatting: 0.05,
      metadata: 0.05,
      initialization: 0.05,
    },
    prompts: {
      questionEvaluation: "Evaluate ...",
    },
  }
);

await compareLibraries(
  "/tailwindlabs/tailwindcss.com",
  "/websites/tailwindcss",
  "1. How can I install rust?",
  {
    report: {
      console: true,
    },
    llm: {
      temperature: 0.95,
      topP: 0.8,
      topK: 45,
    },
    prompts: {
      questionEvaluation: "Evaluate ...",
    },
  }
);

await scoreQA(
  "How can I install LangChain Core?"
);
```
## Configuration Options

For getScore and compareLibraries:

```typescript
{
  report: {
    console: boolean;
    folderPath: string;
    humanReadable: boolean;
    returnScore: boolean;
  };
  weights: {
    question: number;
    llm: number;
    formatting: number;
    metadata: number;
    initialization: number;
  };
  llm: {
    temperature: number;
    topP: number;
    topK: number;
    candidateCount: number;
    seed: number;
  };
  prompts: {
    searchTopics: string;
    questionEvaluation: string;
    llmEvaluation: string;
  };
}
```

For scoreQA:

```typescript
{
  report: {
    console: boolean;
  };
  llm: {
    temperature: number;
    topP: number;
    topK: number;
    candidateCount: number;
    seed: number;
  };
  prompts: {
    questionEvaluation: string;
  };
}
```

## Configuration Details
* compareLibraries
  * must have two libraries that have the same product
  * will output results to result-compare.json and result-compare-LIBRARY_NAME.txt
* getScore
  * will output machine-readable results to result.json and human-readable results to result-LIBRARY_NAME.txt in the specified directory
* scoreQA only returns the score and explanations, or logs them to the console.
* report
  * console: true prints results to the console.
  * folderPath specifies the folder for human-readable and machine-readable results (the folder must already exist). The machine-readable file adds new libraries or updates existing ones.
  * humanReadable writes the results to a txt file.
  * returnScore returns the average score as a number for getScore and an object for compareLibraries.
* weights
  * Specifies the weight breakdown for the evaluation metrics. If changing the weights, every metric must have a value (which can be 0) and the weights must sum to 1, as shown in the sketch after this list.
* llm
  * LLM configuration options for Gemini.
  * The defaults are specific values chosen to make results more reproducible.
* prompts
  * Replaces the current prompts. It is not recommended to change the final output instructions or the score maximum (e.g., 100 -> 10).
  * Each prompt accepts different placeholders, which must be formatted as {{variableName}} with the correct variable name for that prompt (see Placeholder Reference).
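For example, the call below is a minimal sketch of overriding the defaults: the weights cover all five metrics (zeros are allowed) and sum to 1, and returnScore is used so that getScore resolves to the average score as a number. The library ID, question, and option names are taken from the examples above; the specific weight values are arbitrary.

```typescript
import { getScore } from "@upstash/c7score";

// Minimal sketch: every metric must be weighted (zeros are allowed) and the
// weights must sum to 1; returnScore makes getScore resolve to the average score.
const score = await getScore(
  "/websites/python_langchain",
  "1. What is a selector, and how do I use it?",
  {
    report: {
      console: false,
      returnScore: true,
    },
    weights: {
      question: 0.5,
      llm: 0.5,
      formatting: 0,
      metadata: 0,
      initialization: 0,
    },
  }
);

console.log(`Average score: ${score}`);
```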
## Placeholder Reference

| Prompt             | For getScore                       | For compareLibraries                                   | For scoreQA               |
|--------------------|------------------------------------|--------------------------------------------------------|---------------------------|
| searchTopics       | {{product}}, {{questions}}         | –                                                      | –                         |
| questionEvaluation | {{contexts}}, {{questions}}        | {{contexts[0]}}, {{contexts[1]}}, {{questions}}        | {{context}}, {{question}} |
| llmEvaluation      | {{snippets}}, {{snippetDelimiter}} | {{snippets[0]}}, {{snippets[1]}}, {{snippetDelimiter}} | –                         |
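As an illustration of the placeholder format, the sketch below passes a custom questionEvaluation prompt to getScore. The prompt wording is invented for this example (and, per the note above, a real replacement should keep the output instructions and score maximum intact), but the {{contexts}} and {{questions}} placeholders match the getScore column of the table.

```typescript
import { getScore } from "@upstash/c7score";

// Hypothetical prompt text for illustration only; the placeholder names must
// match the ones listed for getScore's questionEvaluation prompt.
const questionEvaluation = `
Evaluate how well the following context answers each question.

Context:
{{contexts}}

Questions:
{{questions}}
`;

await getScore(
  "/websites/python_langchain",
  "1. What is a selector, and how do I use it?",
  {
    report: { console: true },
    prompts: { questionEvaluation },
  }
);
```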
## Default Configuration

For getScore and compareLibraries:

```typescript
{
  report: {
    console: true,
    humanReadable: false,
    returnScore: false,
  },
  weights: {
    question: 0.8,
    llm: 0.05,
    formatting: 0.05,
    metadata: 0.05,
    initialization: 0.05,
  },
  llm: {
    temperature: 0,
    topP: 0.1,
    topK: 1,
    candidateCount: 1,
    seed: 42,
  },
}
```

For scoreQA:

```typescript
{
  report: {
    console: true,
  },
  llm: {
    temperature: 0,
    topP: 0.1,
    topK: 1,
    candidateCount: 1,
    seed: 42,
  },
}
```

* Note: scoreQA will always return the scores as objects.
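A minimal sketch of consuming that returned object (the call shape mirrors the usage example above; the object's exact property names are not listed here, so it is only logged):

```typescript
import { scoreQA } from "@upstash/c7score";

// scoreQA resolves to an object with the score and explanation (see the note
// above); its exact shape is not documented here, so it is simply logged.
const result = await scoreQA("How can I install LangChain Core?");
console.log(result);
```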
## Example Output
Example output from scoreQA:

```text
Score: 75
Explanation: The provided context contains several code snippets that show how to install @langchain/core for JavaScript/TypeScript using npm, yarn, pnpm, and bun.
However, it fails to provide the equivalent pip command for installing the langchain-core Python package, thus only partially answering the question.
```

Example output from getScore (compareLibraries is the same):

```text
== Average Score ==
92
== Questions Score ==
100
== Questions Explanation ==
The context provides clear definitions and multiple comprehensive code examples that explain what a selector is in LangChain.
It thoroughly demonstrates how to use various types of selectors, such as
SemanticSimilarityExampleSelector and LengthBasedExampleSelector, by integrating them into few-shot prompt templates.
== LLM Score ==
46
== LLM Explanation ==
The score is low due to a high rate of duplicate snippets, with over a quarter being identical copies. Additionally, a majority of snippets fail the syntax criterion due to errors, use of placeholders, and formatting issues that also impact clarity.
== Formatting Score ==
0
== Project Metadata Score ==
100
== Initialization Score ==
100
```