Show HN: Never train another ML model again

4 points by galgia 5 months ago

Hello, Hacker News community!

I made FlashLearn, an open-source library for integrating Large Language Models (LLMs) into your workflows. With FlashLearn, you can build JSON-based pipelines for tasks like classification and labeling in a few lines of code, while keeping outputs in a standardized format for downstream processing.

Key Features:

* Quick Setup: Install FlashLearn via PyPI:

  ```
  pip install flashlearn
  ```

  Then set your LLM provider credentials if you're using OpenAI or DeepSeek:

  ```
  export OPENAI_API_KEY="YOUR_API_KEY"
  ```

* JSON-Centric Pipelines: Easily structure and process data. Here's an example of performing sentiment analysis on IMDB movie reviews using a prebuilt skill:

  ```python
  import json

  from flashlearn.utils import imdb_reviews_50k
  from flashlearn.skills import GeneralSkill
  from flashlearn.skills.toolkit import ClassifyReviewSentiment

  # Load a 100-review sample and the prebuilt sentiment skill
  data = imdb_reviews_50k(sample=100)
  skill = GeneralSkill.load_skill(ClassifyReviewSentiment)
  tasks = skill.create_tasks(data)

  # Process tasks in parallel
  results = skill.run_tasks_in_parallel(tasks)

  # Save one JSON object per line (JSONL), pairing each input with its result
  with open('sentiment_results.jsonl', 'w') as f:
      for task_id, output in results.items():
          input_json = data[int(task_id)]
          input_json['result'] = output
          f.write(json.dumps(input_json) + '\n')
  ```
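
  Once the JSONL file is written, a quick tally of the results is a useful sanity check. The sketch below uses only the standard library; `tally_results` is a hypothetical helper, not part of FlashLearn, and it assumes each line carries a `result` key as in the example above. The exact shape of `result` depends on the skill, so values are serialized before counting.

  ```python
  import json
  from collections import Counter


  def tally_results(path, key="result"):
      """Count how often each distinct value of `key` appears in a JSONL file."""
      counts = Counter()
      with open(path, encoding="utf-8") as f:
          for line in f:
              line = line.strip()
              if not line:
                  continue  # tolerate blank lines
              record = json.loads(line)
              # Serialize the value so dicts and lists can be counted too.
              counts[json.dumps(record.get(key), sort_keys=True)] += 1
      return counts
  ```

  Calling `tally_results('sentiment_results.jsonl').most_common()` then lists each distinct output with its count.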

* Multi-Step Pipelines: Chain and extend workflows by passing structured outputs to subsequent skills.

  ```python
  # Example of chaining tasks
  # next_skill = ...
  # next_tasks = next_skill.create_tasks([...based on 'output'...])
  # next_results = next_skill.run_tasks_in_parallel(next_tasks)
  ```
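
  The data-shaping step between two skills can be sketched in plain Python. This assumes, as in the sentiment example, that `results` maps string task ids to JSON-like outputs and that each id indexes into the input list; `results_to_records` is a hypothetical helper, not a FlashLearn function, and the toy data stands in for real model output.

  ```python
  def results_to_records(data, results, key="result"):
      """Merge each output back into its input row, ready for a next stage."""
      records = []
      # Sort by numeric task id so the output order matches the input order.
      for task_id, output in sorted(results.items(), key=lambda kv: int(kv[0])):
          row = dict(data[int(task_id)])  # copy so the input list is not mutated
          row[key] = output
          records.append(row)
      return records


  # Toy stand-ins for the first stage's inputs and outputs (not real model output).
  data = [{"review": "Loved it"}, {"review": "Dull"}]
  results = {"1": {"label": "negative"}, "0": {"label": "positive"}}

  next_inputs = results_to_records(data, results)
  # next_tasks = next_skill.create_tasks(next_inputs)
  ```

  Copying each row keeps the first stage's input untouched, which helps if the same data feeds more than one downstream skill.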

* Custom Skills: For domain-specific needs, easily define custom skills:

  ```python
  from flashlearn.skills.learn_skill import LearnSkill

  learner = LearnSkill(model_name="gpt-4o-mini")
  skill = learner.learn_skill(data, task='Define categories "satirical", "quirky", "absurd".')
  tasks = skill.create_tasks(data)
  ```

* Image Classification: Classify visual data with single-label or multi-label classification.

  ```python
  from flashlearn.skills.classification import ClassificationSkill

  images = [...]  # Base64-encoded images
  skill = ClassificationSkill(
      model_name="gpt-4o-mini",
      categories=["cat", "dog"],
      max_labels=1,
      system_prompt="Classify images."
  )
  tasks = skill.create_tasks(images, column_modalities={"image_base64": "image_base64"})
  results = skill.run_tasks_in_parallel(tasks)
  ```
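
The image example leaves the construction of `images` open. One way to prepare Base64-encoded inputs with the standard library is sketched below. It assumes each input row is a dict whose `image_base64` key holds the encoded string, which is inferred from the `column_modalities` mapping above rather than confirmed by the library; `encode_images` is a hypothetical helper.

```python
import base64
from pathlib import Path


def encode_images(paths, column="image_base64"):
    """Read image files and return one dict per file with Base64 text."""
    rows = []
    for path in paths:
        raw = Path(path).read_bytes()
        # b64encode returns bytes; decode to str so the row is JSON-serializable.
        rows.append({column: base64.b64encode(raw).decode("ascii")})
    return rows
```

For example, `images = encode_images(["cat.jpg", "dog.jpg"])` (file names are placeholders) would give a list suitable for a JSON-based task payload under the assumption above.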