Show HN: Never train another ML model again
Hello, Hacker News community!
I made FlashLearn, an open-source library for integrating Large Language Models (LLMs) into your workflows. With a few lines of code, you can build JSON-based pipelines for tasks like classification and labeling, while keeping outputs standardized so downstream code can consume them directly.
Key Features:
* Quick Setup: Install FlashLearn via PyPI:
```
pip install flashlearn
```
Then set your LLM provider credentials if you're using OpenAI or DeepSeek:
```
export OPENAI_API_KEY="YOUR_API_KEY"
```
* JSON-Centric Pipelines: Easily structure and process data. Here's an example of performing sentiment analysis on IMDB movie reviews using a prebuilt skill:
```python
from flashlearn.utils import imdb_reviews_50k
from flashlearn.skills import GeneralSkill
from flashlearn.skills.toolkit import ClassifyReviewSentiment
# Load data and skills
data = imdb_reviews_50k(sample=100)
skill = GeneralSkill.load_skill(ClassifyReviewSentiment)
tasks = skill.create_tasks(data)
# Process tasks in parallel
results = skill.run_tasks_in_parallel(tasks)
# Save outputs as clean JSON
import json
with open('sentiment_results.jsonl', 'w') as f:
    for task_id, output in results.items():
        # Task IDs are string indices into the original data
        input_json = data[int(task_id)]
        input_json['result'] = output
        f.write(json.dumps(input_json) + '\n')
```
* Multi-Step Pipelines: Chain and extend workflows by passing structured outputs to subsequent skills.
```python
# Example of chaining tasks
# next_skill = ...
# next_tasks = next_skill.create_tasks([...based on 'output'...])
# next_results = next_skill.run_tasks_in_parallel(next_tasks)
```
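The glue between two steps can be sketched in plain Python. This is only an illustration, not part of FlashLearn: `build_next_inputs` is a hypothetical helper, and it assumes `results` is a dict keyed by string task IDs that index into `data`, as in the sentiment example above. The `{"label": ...}` output shape is made up for the demo.

```python
from typing import Any


def build_next_inputs(
    data: list[dict[str, Any]],
    results: dict[str, Any],
    result_key: str = "result",
) -> list[dict[str, Any]]:
    """Attach each task's output to a copy of its input record."""
    next_inputs = []
    # Sort IDs numerically so the output order follows the original data
    for task_id in sorted(results, key=int):
        record = dict(data[int(task_id)])  # shallow copy; input stays untouched
        record[result_key] = results[task_id]
        next_inputs.append(record)
    return next_inputs


# Stand-in values; in a real run these come from the first skill
data = [{"review": "Loved it"}, {"review": "Dull and far too long"}]
results = {"1": {"label": "negative"}, "0": {"label": "positive"}}  # hypothetical shape

next_inputs = build_next_inputs(data, results)
print(next_inputs[0])
```

The returned list is what you would hand to the next skill's `create_tasks`. Copying each record keeps the original `data` reusable if a later step needs the raw inputs.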
* Custom Skills: For domain-specific needs, easily define custom skills:
```python
from flashlearn.skills.learn_skill import LearnSkill
learner = LearnSkill(model_name="gpt-4o-mini")
skill = learner.learn_skill(data, task='Define categories "satirical", "quirky", "absurd".')
tasks = skill.create_tasks(data)
```
* Image Classification: Handle visual data with ease using flexible tools for single and multi-label classification.
```python
from flashlearn.skills.classification import ClassificationSkill
images = [...] # Base64-encoded images
skill = ClassificationSkill(
    model_name="gpt-4o-mini",
    categories=["cat", "dog"],
    max_labels=1,
    system_prompt="Classify images."
)
tasks = skill.create_tasks(images, column_modalities={"image_base64": "image_base64"})
results = skill.run_tasks_in_parallel(tasks)
```
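The `images = [...]` placeholder in the image example can be filled with the standard library. The sketch below is not a FlashLearn API: `encode_images` is a hypothetical helper, and it assumes each input is a dict with an `image_base64` key, which is a guess based on the `column_modalities` argument. The demo writes placeholder bytes to a temporary file rather than reading a real image.

```python
import base64
import tempfile
from pathlib import Path


def encode_images(paths, key="image_base64"):
    """Return one record per file, with its bytes base64-encoded as ASCII text."""
    records = []
    for path in paths:
        raw = Path(path).read_bytes()
        records.append({key: base64.b64encode(raw).decode("ascii")})
    return records


# Demo with a throwaway file; point this at real image paths in practice
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"\x89PNG placeholder bytes")
    images = encode_images([sample])

print(list(images[0].keys()))
```

Decoding to ASCII text matters because raw `bytes` values cannot be serialized to JSON.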