Labelbox is focused on building a data-centric AI platform for enterprises to develop, optimize, and use AI to solve problems and power new products and services.
Enterprises use Labelbox to curate data, generate high-quality human feedback data for computer vision and LLMs, evaluate model performance, and automate tasks by combining AI and human-centric workflows. The academic & research community uses Labelbox for cutting-edge AI research.
Visit Labelbox for more information.
If you haven't already, create a free account at Labelbox.
Log into Labelbox and navigate to Account > API Keys to generate an API key.
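One way to keep the generated key out of source code is to read it from an environment variable. The sketch below assumes the variable is named LABELBOX_API_KEY; the helper load_api_key is illustrative and not part of the SDK.

```python
import os


def load_api_key(env_var: str = "LABELBOX_API_KEY") -> str:
    """Return the API key stored in `env_var` (an assumed variable name).

    Raises RuntimeError when the variable is missing or blank, so a
    misconfigured environment fails early with a readable message.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Labelbox API key."
        )
    return key


# Hypothetical usage once the SDK is installed:
#   import labelbox as lb
#   client = lb.Client(load_api_key())
```

Failing early here avoids a less obvious authentication error later, when the first request is made.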
To install the SDK, run the following command.
pip install labelbox
If you'd like to install the SDK with enhanced functionality, which adds optional data-processing capabilities, run the following command.
pip install "labelbox[data]"
If you want to install a version of Labelbox built locally, be aware that only tagged commits have been validated to work fully. Installing the latest commit from develop is at your own risk!
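To confirm which version ended up installed (for example, to tell a tagged release from a local build), a small check with the standard library can help. This is a minimal sketch; installed_version is a hypothetical helper, and it assumes the distribution is named labelbox, as in the pip commands above.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "labelbox") -> Optional[str]:
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string, or None when the package is not installed.
print(installed_version())
```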
After installing the SDK and getting an API Key, it's time to validate them both.
import labelbox as lb
client = lb.Client(API_KEY) # API_KEY = API Key generated from labelbox.com
dataset = client.create_dataset(name="Test Dataset")
data_rows = [{"row_data": "My First Data Row", "global_key": "first-data-row"}]
task = dataset.create_data_rows(data_rows)
task.wait_till_done()
You should be set! Running the snippet above should create a dataset called Test Dataset with a single data row whose text content is My First Data Row. You can log into Labelbox to verify this. If you have any issues, please file a GitHub Issue or contact Labelbox Support directly. For more advanced examples and information on the SDK, see Documentation below.
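When uploading more than one data row, it can help to assemble the payload programmatically and catch repeated global keys before calling create_data_rows. The sketch below only builds the list of dictionaries in the same shape as the snippet above; build_data_rows is a hypothetical helper, not an SDK function, and the second row is made-up sample data.

```python
from typing import Dict, Iterable, List, Tuple


def build_data_rows(items: Iterable[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Turn (global_key, text) pairs into data row dictionaries.

    Raises ValueError on a repeated global_key, on the assumption that
    each key should identify exactly one data row.
    """
    rows: List[Dict[str, str]] = []
    seen = set()
    for global_key, text in items:
        if global_key in seen:
            raise ValueError(f"duplicate global_key: {global_key!r}")
        seen.add(global_key)
        rows.append({"row_data": text, "global_key": global_key})
    return rows


data_rows = build_data_rows(
    [
        ("first-data-row", "My First Data Row"),
        ("second-data-row", "My Second Data Row"),
    ]
)

# The result can then be passed to the SDK as in the snippet above:
#   task = dataset.create_data_rows(data_rows)
#   task.wait_till_done()
```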
We encourage anyone to contribute to this repository to help improve it. Please refer to Contributing Guide for detailed information on how to contribute. This guide also includes instructions for how to build and run the SDK locally.
Using the GPT repository loader, we have created lbx_prompt.txt, which contains data from all .py and .md files. The file has about 730k tokens. We recommend using Gemini 1.5 Pro with a 1-million-token context window.
The SDK is well-documented to help developers get started quickly and use the SDK effectively. Here are some resources: