# sitemap.yaml
README.md:
hash: 2e592f00c143c2aa89774674fd7a1dcd
summary: Dria is a comprehensive synthetic data infrastructure offering a framework
for creating, managing, and orchestrating synthetic data pipelines. It features
a multi-agent network to generate data from web and siloed sources, making it
ideal for AI projects, including those involving large language models. Dria provides
massive parallelization, compute offloading, and flexible tools for custom pipelines,
eliminating the need for personal GPU infrastructure. It leverages decentralization
for state-of-the-art synthesized data, enabling web-based grounding and diverse
model outputs. Key benefits include scalability, efficient data generation, and
extensive built-in tools for accelerated AI development.
cookbook/eval.md:
hash: e41b7d4ea9fa06e011ae7f207b79df8a
summary: The guide on "Evaluating RAG Systems with Synthetic Data" provides a comprehensive
approach to assessing Retrieval-Augmented Generation (RAG) systems using synthetic
datasets. It emphasizes creating diverse question-answer pairs to test RAG pipelines
and measure performance across different parameters such as embedding models,
retrieval methods, and reranking strategies. The guide details the setup process,
RAG class implementation using libraries like RAGatouille and instructor, and
steps for generating synthetic data, including QA pairs and multi-hop questions.
This approach enhances the ability to optimize AI applications by understanding
their performance metrics using synthetic baseline data. Key topics include RAG,
synthetic data, AI evaluation, and performance metrics.
cookbook/function_calling.md:
hash: c48ac8817fe46dc9e897812fdf50c350
summary: 'Explore effective strategies and best practices for function calling in
programming, focusing on syntax, software development, and real-world applications.
This guide covers essential techniques for optimizing function calls, enhancing
code readability, and improving program efficiency. Key concepts include understanding
different types of function calls, importance of well-structured syntax, and practical
implementation tips for software engineers. Keywords: function calling, programming,
software development, best practices, syntax.'
cookbook/nemotron_qa.md:
hash: 9a89a7b09ea66ffb3df5c78465f1d0e5
summary: 'This guide details the implementation of Nvidia''s Preference Data Pipeline
for generating synthetic data using Dria and the Llama 3.1 model. The pipeline
involves two main steps: Synthetic Response Generation, where Llama 3.1 generates
questions and answers based on a given topic, and Reward Model as a Judge, which
leverages Nemotron-4 to score responses. The setup includes defining a folder
structure with specific Python tasks and prompts for each pipeline step (subtopic
generation, question generation, and answer generation). The implementation focuses
on the intricacies of building data flows, executing multiple tasks in parallel,
and setting up asynchronous execution with Dria. This content covers key terms
such as Nvidia, Llama 3.1, Dria, synthetic data, machine learning, Preference
Data, Nemotron, and Pipeline Builder.'
cookbook/patient_dialogues.md:
hash: 9a2c3a5a737be8ae458135ca4449d364
summary: The content focuses on analyzing patient dialogues to gain insights into
healthcare conversations, with the aim of improving patient experience in medical
settings. It highlights important aspects such as medical communication, patient
safety, and clinical interactions. The objectives include enhancing the quality
of healthcare interactions and ensuring effective communication between patients
and healthcare providers. Key points include understanding patient experience,
optimizing healthcare dialogue, and prioritizing patient safety. Keywords for
SEO include patient experience, healthcare dialogue, medical communication, patient
safety, and clinical interactions.
cookbook/preference_data.md:
hash: 29729481d57824e4de171a02377e768d
summary: 'The document explores methods for generating synthetic preference data
using Dria, an innovative tool that enhances data analysis in AI applications.
It focuses on synthetic data generation techniques tailored for preference data,
providing insights on improving AI model training and analysis. Key topics include
the benefits of using synthetic data in AI, methods for creating realistic preference
datasets, and the role of Dria in streamlining the data generation process. Keywords:
synthetic data, preference data, data generation, AI, Dria.'
example_run.md:
hash: 6631111125885290fb2f0796373de649
summary: This content provides a Python example demonstrating asynchronous execution
of multiple AI models using the Dria library, highlighting categories such as
asynchronous programming, parallel execution, and the use of AI models within
a workflow. The example script utilizes components like `Dria`, `MagPie`, and
`ParallelSingletonExecutor` to run models such as `GPT4O_MINI`, `GEMINI_15_FLASH`,
and `MIXTRAL_8_7B` in parallel. It shows how to load and run instructions asynchronously
to improve efficiency in AI-related tasks. Key elements include keywords such
as Python, Dria, AI models, and asynchronous programming.
factory/clair.md:
hash: 1599d9db6b9b318aed25b960299c6188
summary: Clair is a SingletonTemplate task designed to correct student coding solutions
using reasoning to improve their coding skills. It leverages AI models for accurate
code outputs, focusing on key areas like AI Correction, Student Solutions, and
Coding Education. Clair processes inputs such as the task description and student's
original solution to produce outputs including reasoning for corrections, an improved
student solution, and details on the AI model used, such as GEMMA2_9B_FP16. This
task is beneficial for enhancing coding education through task automation and
machine learning, specifically highlighted in projects like Distilabel CLAIR and
Anchored Preference Optimization.
factory/code_generation.md:
hash: 74cb0b00676198f8a5bd2d8d0df5200f
summary: The `GenerateCode` task, built on the `SingletonTemplate`, facilitates
code generation based on specific instructions and targeted programming languages,
leveraging coder models for precision. Optimized for coder models like `Model.CODER`
and `Model.QWEN2_5_CODER_1_5B`, it translates instructions into executable code
seamlessly. It requires inputs such as an instruction and a programming language,
and outputs the original instruction, language, generated code, and the model
used. This tool is ideal for software engineering tasks, supporting multiple languages
through AI-driven models for enhanced programming and development. Key themes
include code generation, singleton pattern, and AI-integrated software development.
factory/complexity_scorer.md:
hash: 8e03fd72722882b679119fec61cce831
summary: 'ScoreComplexity is a tool designed to rank instructions based on complexity
using the GEMMA2_9B_FP16 AI model. This `Singleton` task is essential for evaluating
task complexity and instruction ranking, catering to fields like AI model development
and task assessment. The tool takes a list of instructions as input and outputs
a string of complexity scores, offering insights into task evaluation. An example
provided demonstrates how to implement the function in Python, highlighting the
tool''s practical application. Keywords: complexity scoring, instruction ranking,
AI models, GEMMA2, task evaluation.'
factory/csv_extender.md:
hash: 6b49d7c50b48c9ef74edf308cf7cc44c
summary: 'The `CSVExtenderPipeline` class is a Python tool designed to efficiently
extend CSV data by automatically generating new rows with subcategories. This
functionality is beneficial for workflows involving data extension, processing,
and automation. The pipeline accepts CSV data in string format, along with parameters
for the number of new values and rows to be generated, resulting in an expanded
dataset. Users can leverage this tool in various applications such as file management,
web browsing, communication, and scheduling tasks. Key features include scalability
and automation, making it suitable for handling large datasets quickly and efficiently.
Keywords: CSV extension, data processing, Python automation, data workflows, subcategories
generation.'
factory/evaluate.md:
hash: f940fcf91438c840c682874166209e0b
summary: The document outlines the `EvaluatePrediction` task, a Singleton task in
machine learning, designed to assess if a predicted answer is contextually and
semantically correct when compared to the correct answer. Key components include
inputs like the prediction, question, and context, and outputs such as the evaluation
result and model used. The task employs semantic context analysis and prediction
evaluation, leveraging machine learning models to ensure accuracy. An example
using GPT-4O demonstrates how to evaluate a prediction against given context and
question, returning an evaluation indicating correctness, and identifying the
model used. Important keywords include prediction evaluation, machine learning,
semantic context, and model assessment.
factory/evolve_complexity.md:
hash: 18c08d0a557b5dceb22785feda61134b
summary: EvolveComplexity is a unique Singleton task designed to enhance the complexity
of instructions using advanced language models like GEMMA2. It takes a simple
instruction input and outputs a more intricate version, utilizing AI for complexity
generation. The process involves models such as GEMMA2_9B_FP16 to create detailed
and sophisticated instructions. This method is highly relevant in the fields of
AI instruction, language modeling, and complexity generation. Key terms include
EvolveComplexity, AI Instruction, Language Model, and GEMMA2. For more information
and resources, users can refer to WizardLM and related projects on GitHub.
factory/graph_builder.md:
hash: a19455cd8616c25ba2b4a954764afc3a
summary: GenerateGraph is a task in Dria that uses AI to create graphs representing
concepts and their relationships from a given context. It extracts an ontology
of terms and visualizes them as nodes and edges, providing insight into the connections
between different concepts, specifically in fields like artificial intelligence,
machine learning, and deep learning. The tool utilizes advanced models like GEMMA2_9B_FP16
for generating these graphs. Key details include its ability to identify subfields
within AI, such as machine learning and neural networks, which are essential for
understanding complex patterns in data. This functionality supports AI-related
tasks like graph generation, conceptual analysis, and visual representation of
data relationships.
factory/instruction_backtranslation.md:
hash: f9fdc15b60420201832687c44ba1e158
summary: Instruction Backtranslation is a Singleton task designed to evaluate
the quality of AI-generated responses to given instructions by assigning a score
from 1 to 5 and providing reasoning. It uses models like GPT-4 to determine accuracy
and relevance, as showcased in the example where correct and incorrect math problem
solutions are scored and reasoned. Implemented with ParallelSingletonExecutor,
this process allows simultaneous evaluation across multiple models. This technique
is useful for improving AI performance by ensuring responses are accurate, concise,
and aligned with user instructions. Key insights include scoring strategies, real-time
evaluation, and enhancing AI-generated content's reliability.
factory/instruction_evolution.md:
hash: 8625bc23e56207de30d80843a2b3a2f1
summary: EvolveInstruct is a tool designed to enhance the depth and relevance of
prompts using advanced AI models. It applies various mutation types such as "FRESH_START,"
"ADD_CONSTRAINTS," "DEEPEN," "CONCRETIZE," "INCREASE_REASONING," and "SWITCH_TOPIC"
to alter original prompts into more informative versions. The tool integrates
with models like `GEMMA2_9B_FP16` and provides outputs including the mutated prompt,
the original prompt, and the model used. By improving prompt complexity and relevance,
EvolveInstruct supports tasks in natural language processing with applications
in AI model prompt evolution. Key features include prompt mutation, AI-driven
enhancements, and detailed instruction-following capabilities.
factory/iterate_code.md:
hash: 6869a29e4a266de9db0ce56407de10a4
summary: IterateCode is a Singleton task designed to enhance existing code by following
specific instructions, improving outputs, and incorporating better error handling.
It takes inputs such as the original code, instructions for improvement, and the
programming language, generating refined code using models like DEEPSEEK_CODER_6_7B.
The process involves iterating over the code to apply enhancements, which is illustrated
through an example of adding error handling to a simple Python function. Key areas
of focus include code improvement, iteration, error handling, software engineering,
and code generation.
factory/list_extender.md:
hash: 7e1a228507c728c6b3bfafe2e17f4ad0
summary: The "ListExtenderPipeline" class is a dynamic tool designed to enhance
and extend lists by generating new subcategories from existing items. Aimed at
optimizing workflows, it helps in list management and data processing by providing
a structured pipeline to create expansive and granular lists. By leveraging key
features such as granularization and customizable subcategory generation, it's
ideal for data processing tasks that require detailed categorization. This tool
is particularly useful for organizations needing comprehensive lists for various
categories, such as Wildlife, Computers, Music, and more, all structured systematically
for better management and analysis. Key phrases include list management, data
processing, pipeline, subcategories, and dynamic lists.
factory/magpie.md:
hash: 721abd4ef19d9d04679dc2d9fb3528de
summary: MagPie is a specialized AI workflow that generates structured dialogues
between two distinct personas using advanced AI models like GEMMA2_9B_FP16. It
is designed for tasks involving AI dialogue generation, natural language processing,
and synthetic conversations. Users can customize the number of dialogue turns
and choose the personas, such as a "curious scientist" and an "AI assistant."
MagPie outputs a dialogue list with each speaker's contribution and identifies
the AI model used. The tool underscores responsible AI development by addressing
bias in training data. It references methods for bias mitigation, such as data
diversity, algorithmic adjustments, and ongoing evaluation processes. Explore
AI models and conversational AI solutions for improved interaction generation.
factory/multihopqa.md:
hash: 27a3f0f48a6b9f0aebe83f9e244e08d3
summary: 'The "MultiHopQuestion" task is a cutting-edge AI solution designed to
generate multi-hop questions from three input documents, facilitating efficient
reasoning across multiple texts. This process involves creating questions that
require different levels of reasoning: 1-hop from a single document, 2-hop across
two documents, and 3-hop spanning all three documents. The task outputs include
the generated questions, corresponding answers, and the model used, such as "mixtral:8x7b".
This AI-driven approach aids in advanced document processing, data extraction,
and AI reasoning, enhancing understanding through structured query generation.
Key topics include applied AI, multi-hop questions, and question generation for
comprehensive data analysis.'
factory/persona.md:
hash: 39282e615c68300ad172cc91baf5b994
summary: The PersonaPipeline class is designed for generating detailed personas
with unique backstories and settings, specifically tailored for simulations in
a cyberpunk-themed environment. It allows users to specify the number of personas
to generate based on a given simulation description, such as a futuristic 2077
cityscape. Key features include creating random variables that align with the
simulation's context and generating backstories featuring distinct characters,
like augmented mercenaries or entrepreneurial street vendors navigating a dystopian
society. The process supports data generation, emphasizing persona creation for
simulation, cyberpunk settings, and AI-driven storytelling. Keywords include synthetic
data, persona generation, simulation, cyberpunk, and AI.
factory/qa.md:
hash: 50d8cc903ea68ce53d91360229cfd638
summary: 'The QAPipeline class is designed to create a robust pipeline for generating
personas and simulating question-answer interactions, enhancing AI conversations.
By processing simulation descriptions, it creates detailed personas with backstories
and handles dynamic Q&A sessions, generating multiple questions and answers based
on input text chunks. Key features include setting context through simulation
descriptions and tailoring response tone and style with persona descriptions.
It focuses on improving AI accuracy and versatility, leveraging frameworks like
synthetic data generation and iterative training, crucial for AI researchers aiming
to optimize language model evaluations and applications. Keywords: QAPipeline,
AI personas, question-answering, simulation, persona generation, iterative training,
synthetic data.'
factory/quality_evolution.md:
hash: 946311204688fc3735d4c90f7ec14002
summary: EvolveQuality is a Singleton task aimed at enhancing the quality of AI-generated
responses to prompts by utilizing specified methods such as Helpfulness, Relevance,
Deepening, Creativity, and Details. This process involves rewriting the original
response to improve its quality using the GEMMA2 9B FP16 model. Important keywords
include Applied AI, Response Enhancement, Natural Language Processing, and Prompt
Engineering. The core objective is to refine AI outputs for better clarity and
depth, making it essential for tasks requiring precise and detailed information
in AI applications. The task is implemented through an asynchronous process, as
demonstrated in the provided Python code example.
factory/search.md:
hash: 9fd0730299ad1057eeeb4a9f9849e270
summary: The document discusses the `SearchPipeline` class designed for efficient
web data retrieval and summarization based on user-defined topics. It outlines
the components of the SearchPipeline, including the `PageAggregator` for gathering
web pages and the `PageSummarizer` for condensing information if summarization
is enabled. An example is provided showcasing the setup for a search on "artificial
intelligence" with summarization, which includes executing the pipeline and saving
results in a JSON format. Key topics include search automation, web scraping,
data retrieval, AI pipelines, and summarization techniques.
factory/self_instruct.md:
hash: 4ebd6959ffa2a22ccdc3a867622cabac
summary: The document describes "SelfInstruct," a tool designed to generate diverse
user queries for AI applications, particularly within professional task management
settings. Utilizing the GEMMA2_9B_FP16 model, SelfInstruct creates user instructions
based on criteria such as query diversity and relevance to specific contexts.
Key features include input parameters like the number of queries, application
description, and context, leading to outputs that provide structured user queries.
This tool aids in enhancing AI-driven task management by generating relevant user
interactions. Key SEO terms include AI, query generation, task management, user
interaction, and Gemma.
factory/semantic_triplet.md:
hash: 93b4440e8d615f1056b2efd1f091efd0
summary: SemanticTriplet is a task designed to generate JSON objects containing
three textual units, known as semantic triplets, with specified semantic similarity
scores. It allows users to specify parameters such as the type of textual unit
(e.g., sentence or paragraph), language, similarity scores, and the educational
difficulty level. This task is particularly useful in natural language processing
(NLP) applications focused on text similarity and can be used for educational
tools. SemanticTriplet leverages models like GEMMA2_9B_FP16 and LLAMA3_2_3B to
produce these textual units, making it a valuable tool for generating and analyzing
semantic similarities in educational and linguistic contexts. Keywords include
semantic triplet, NLP, text similarity, JSON, and educational tools.
factory/simple.md:
hash: bd21b7d4b287af4ffe09d84ba1e870fe
summary: The document provides an overview of the `Simple` task for text generation
using AI models like `GEMMA2_9B_FP16`. It details how the task operates as a singleton
to generate text from a given prompt, specifying inputs and expected outputs.
A Python example demonstrates how to implement this task using the Dria library
for asynchronous code execution. Core keywords include text generation, Python,
GEMMA2, AI, and asyncio. The document aims to instruct on using AI models for
generating text programmatically.
factory/subtopic.md:
hash: 5367ea9fce447c09f412b5cd722742f9
summary: The "SubTopicPipeline" is a Python class designed for generating hierarchical
subtopics recursively from a main topic, using applied AI techniques. It allows
users to specify a maximum depth for the subtopic tree, providing a structured
approach to data generation with recursive functions. The pipeline is implemented
using the Dria library, enabling the automatic generation of nuanced subtopics
for topics, such as "Artificial Intelligence," up to the given depth. Key terms
include subtopics, AI, recursive functions, data generation, and Python.
factory/text_classification.md:
hash: 8d13fe522b8677dcd20d32280fbadeb5
summary: The content provides a guide on implementing text classification using
Python and the GEMMA2 model, focusing on outputting results in a JSON format.
Key elements include defining input parameters such as task description, language,
clarity, and difficulty, and generating JSON objects with 'input_text', 'label',
and 'misleading_label'. An example script is provided, illustrating how to classify
movie reviews as positive or negative using the `GEMMA2_9B_FP16` model. This guide
is essential for anyone interested in applied AI, machine learning, and text classification
using Python and JSON. Key topics include "text classification," "JSON," "Python,"
"machine learning," and "GEMMA2."
factory/text_matching.md:
hash: e6fcfc7d42eef9f49c13ade0137f2673
summary: The content outlines the "TextMatching" task, which involves generating
JSON examples for text matching in natural language processing (NLP). It utilizes
the GEMMA2_9B_FP16 model to create a JSON object containing 'input' and 'positive_document'
fields for a specified task description and language. Key details include the
use of the Dria API for executing the task, the generation of text examples for
applications like sentiment analysis, and the output of a structured JSON format.
Important keywords include Text Matching, Natural Language Processing, JSON Generation,
GEMMA2 Model, and AI Tasks. This guide is particularly useful for those looking
to explore AI-driven text matching solutions.
factory/text_retrieval.md:
hash: 25a24a34c6767278e2e50337c7085a3d
summary: The `TextRetrieval` task facilitates the generation of JSON objects tailored
for text retrieval applications, incorporating a user query, a positive document,
and a hard negative document. It serves various scenarios in text retrieval tasks,
enabling the creation of user queries that encompass different lengths, types,
and clarity levels across languages and difficulty tiers. Key features include
defining the task description, query type, and num_words to set the scope of retrieved
text materials optimally. The tool supports platforms utilizing AI workflows,
text retrieval, JSON generation, and NLP, enhancing data generation and retrieval
precision through model specification, like the example with the `GEMMA2_9B_FP16`
model. Notable references include works on text embeddings and other AI-driven
text retrieval methods.
factory/validate.md:
hash: 6caa8dfc9c8cf6bb3fdb0ece812a0ae3
summary: The "ValidatePrediction" task, part of the Dria framework, is designed
to evaluate the accuracy of predicted answers by comparing them to correct answers.
This Singleton task uses AI models, such as the "GEMMA2_9B_FP16" or "QWEN2_5_32B_FP16",
to determine if predictions are contextually and semantically correct, providing
a boolean output for validation. Key aspects include predictive modeling, AI validation,
and semantic analysis, making it essential for refining machine learning workflows.
Examples demonstrate its use in Python to assess predictions, enhancing AI-driven
decision-making and model accuracy.
factory/web_multi_choice.md:
hash: 2545d270bf32b63ad69b20b423c1b161
summary: WebMultiChoice is a Singleton AI task designed to answer multiple-choice
questions through comprehensive web search and evaluation methods. It uses a workflow
that includes generating a search query, selecting a URL, scraping content, and
evaluating the most accurate answer based on gathered notes. Leveraging advanced
deep learning models like QWEN, WebMultiChoice targets optimal accuracy by analyzing
medical contexts and other subject areas. Key aspects include AI evaluation, deep
learning-based search strategies, and an emphasis on using type II pneumocyte
cells for specific medical inquiries. This AI-driven approach is crucial for tasks
requiring precise multiple-choice question answers and web-based information retrieval.
how-to/batches.md:
hash: 8b1e8ccdba2147a17b813429d2670357
summary: 'This article provides a guide on executing multiple instructions concurrently
using Batches and the ParallelSingletonExecutor in Python with Dria. It highlights
the steps to create a Dria client, a Singleton task, and a ParallelSingletonExecutor
object to facilitate parallel execution, optimizing workflows with large sets
of instructions. Key features include loading instructions with specific prompts
and using different models like QWEN2_5_7B_FP16, LLAMA3_2_3B, and LLAMA3_2_1B.
The example code demonstrates setting up the asynchronous execution of tasks using
`asyncio`, making it ideal for users interested in improving parallel execution
efficiency in Python. Keywords include: Batches, parallel execution, asyncio,
Dria, Python, ParallelSingletonExecutor.'
how-to/formatting.md:
hash: 142b6c2ff79493ced5b5c682a63c8e62
summary: 'The `Formatter` class is a crucial tool for transforming datasets into
training-ready formats compatible with specific trainers. It supports various
format types such as Standard and Conversational, each with subtypes like LANGUAGE_MODELING,
PROMPT_ONLY, PROMPT_COMPLETION, PREFERENCE, and UNPAIRED_PREFERENCE. An example
usage is converting data from the `InstructionBacktranslation` into the `STANDARD_UNPAIRED_PREFERENCE`
format, enhancing its integration with HuggingFace''s TRL framework. This functionality
facilitates seamless training of transformer language models using Reinforcement
Learning, covering steps from supervised fine-tuning to complex policy optimizations.
Key trainers in the TRL framework expect different data formats, ensuring that
generated data fits specific trainer requirements, thus optimizing machine learning
workflows. Keywords: Formatter, dataset transformation, training-ready formats,
HuggingFace TRL, Reinforcement Learning, transformer models, supervised fine-tuning.'
how-to/functions.md:
hash: 7787b28db7f12d6b85890616959bb642
summary: 'The document provides an overview of Dria''s workflow automation tools,
focusing on built-in and custom functions that facilitate automation using Python.
Key components include `CustomTool` and `HttpRequestTool`, which allow users to
create specialized operations and make HTTP requests within workflows. Essential
details include the implementation of these tools in workflows through classes
that inherit from `CustomTool` or `HttpRequestTool` and the use of Python''s `pydantic`
for model definition. Example workflows demonstrate tasks like summing integers
and fetching cryptocurrency prices from APIs. Keywords: Dria, workflow automation,
Python, custom functions, HTTP requests, CustomTool, HttpRequestTool.'
how-to/models.md:
hash: 7a3457a0c573af10a80c421d3f5c5c42
summary: Explore the wide range of AI models available in the Dria Network, featuring
offerings from major developers such as Nous, Microsoft, Google, Meta, Alibaba,
DeepSeek, Mistral, and OpenAI. Key models include Nous's Hermes-2-Theta, Microsoft's
Phi3 and Phi3.5, Google's Gemma2, and Meta's Llama3.1 and Llama3.2 series. The
network also features OpenAI's latest GPT-4 and GPT-4o models, as well as offerings
from other innovators like Alibaba's Qwen and DeepSeek's coding models. These
models vary in size, quantization, and application, providing a broad selection
for different AI and machine learning needs.
how-to/pipelines.md:
hash: d9a27429a97acd789e76e8aa5d7849f3
summary: This guide explains how to create asynchronous pipelines for data processing,
focusing on combining multiple workflows to efficiently generate complex outputs.
It highlights the use of asynchronous processing to execute multiple instructions
in parallel using a sequence of workflows, as illustrated with a Question Answer
(QA) pipeline example. Key components include the use of `Dria`, `PipelineBuilder`,
and `StepTemplate` to create, execute, and connect pipeline steps with callbacks
like `scatter`. The article also provides a detailed implementation procedure
for steps within a pipeline, showcasing how to handle input-output mapping using
built-in and custom callbacks. Essential keywords include pipelines, workflows,
asynchronous processing, data generation, QA pairs, and Dria.
how-to/selecting_models.md:
hash: c79640401cff619b12b744c41640688a
summary: The Dria Network is a robust infrastructure that facilitates efficient
task execution through a network of Large Language Models (LLMs) using a MoA (Mixture-of-Agents)
system. Users can select from a variety of models for their tasks by utilizing
the `Model` enum in Dria's SDK, which enables the assignment of specific models
to tasks. The network supports asynchronous task execution, distributing tasks
to available nodes running the chosen model. If a model is unavailable, the system
will queue the task until it becomes available. Users can also publish tasks to
multiple models simultaneously to compare outputs. Available model providers include
OLLAMA, OPENAI, GEMINI, and CODER. Key topics include model selection, LLMs, task
execution, and asynchronous processing.
how-to/singletons.md:
hash: e5c361bb9b3ca26af0c938d57cc36ad3
summary: This guide explains the concept and implementation of singletons in programming,
with a focus on the Dria SDK. Singletons are pre-built tasks designed for single-instance
use to perform specific functions efficiently without the need for custom code.
The tutorial details how to import and utilize singletons, as well as how to create
custom singletons using the `SingletonTemplate` class. Key points include the
use of the `workflow` and `parse_result` methods, a suggested folder structure
for custom singletons, and an example of creating a singleton that reverses a
string. Keywords include singletons, Dria SDK, custom singletons, and software
design.
how-to/structured_outputs.md:
hash: 597539d237ba282f9d2a9861d1fa1b1b
summary: The article provides a comprehensive guide on generating structured outputs
for book reviews using the Dria SDK in Python, leveraging JSON Schema to ensure
output consistency. It outlines a process for defining a schema for book review
components such as title, rating, genre, review text, and recommendation status.
The guide utilizes the `WorkflowBuilder` in the Dria SDK to set parameters and
workflows and employs the `SingletonTemplate` to parse results effectively, ensuring
that responses adhere to specified formats. Key features discussed include structured
outputs, Dria SDK, JSON Schema, and function calling capabilities, emphasizing
their importance for models like OpenAI and others that support structured feedback.
The tutorial includes full code to facilitate easy implementation and testing.
how-to/tasks.md:
hash: 06933d21492b87e489dbf09f3ed0f310
summary: 'The Dria network uses tasks as fundamental units of work executed by nodes.
These tasks consist of workflows and models, and are processed asynchronously,
allowing for scalable operations across the network. Key features include model
selection, asynchronous execution, result retrieval, and scalability, making the
system flexible and efficient. Tasks are created, published to the network, and
executed by available nodes; results are retrieved and the task is marked complete
once they are obtained. This structure facilitates efficient distribution of work using
the Dria system. Keywords: Dria, tasks, asynchronous processing, model selection,
workflow, scalability.'
how-to/workflows.md:
hash: d4c2140be22cad00e3c88e4f59965c90
summary: 'This article explores custom workflows within the Dria Network using the
`dria_workflows` package to enhance task management with Large Language Models
(LLMs) in Python. Key components of a workflow include configuration settings,
steps for executing tasks, flow for managing order and conditional logic, and
memory operations for inter-step data transfer. The guide details how to create
workflows with steps like `generative_step` and `search_step`, manage memory operations
for inputs and outputs, and define execution flows and conditions. Example workflows,
such as generating and validating random variables, illustrate these concepts.
Key terms: Dria Network, custom workflows, LLM, Python, task management, memory
operations, execution flow.'
installation.md:
hash: d88b12956c32d7ac53a168306d462021
summary: This quickstart guide provides detailed instructions for installing the
Dria SDK, obtaining an RPC token, and setting up the environment for interaction
with the Dria Network. Compatible with Python 3.10 or higher, the guide includes
steps to resolve potential installation issues with dependencies like coincurve,
and covers the requirements for both the Community and Pro Networks. Key features
include exporting the RPC token as an environment variable and notes on the network's
current alpha stage and cost-free access. Users can also contribute by running
a network node. For assistance with installation issues, resources like GCC-related
solution steps and a Discord support community are available. Key terms include
Dria SDK, RPC Token, Python, Machine Learning, and Installation Guide.
modules/structrag.md:
hash: a81a6d05962d54cb64ffc4f6febb37f9
summary: StructRAG is an advanced retrieval-augmented generation (RAG) framework
that enhances large language models (LLMs) for knowledge-intensive reasoning tasks.
It tackles the challenge of scattered and noisy information by employing cognitive-inspired
techniques to identify the optimal structure for a given task, restructuring documents,
and conducting inference for improved accuracy and reasoning. The framework, utilizing
modules like StructRAGSynthesize, StructRAGSimulate, and StructRAGJudge, efficiently
transforms raw information into structured knowledge, simulates responses, and
evaluates solution accuracy, achieving state-of-the-art results across complex
tasks. The framework and pre-trained models are available on Hugging Face for
integration and experimentation. Key terms include StructRAG, LLMs, knowledge
reasoning, cognitive techniques, and data structuring.
modules/structrag2.md:
hash: 3a8cb0b0069f168e8554a1c8f54429da
summary: StructRAG is an innovative approach that enhances knowledge-intensive reasoning
in large language models (LLMs) by utilizing hybrid information structuring. This
method leverages a Hybrid Router to determine the optimal format for structuring
information, enabling more efficient and effective reasoning capabilities in artificial
intelligence. The process is demonstrated using Python, incorporating the StructRAGGraph,
StructRAGCatalogue, StructRAGAlgorithm, and StructRAGTable from the DRIA framework.
This approach aims to advance knowledge restructuring in AI, especially when handling
complex tasks like writing research papers or developing machine learning algorithms.
Key terms include StructRAG, knowledge restructuring, LLM, hybrid information
structuring, and AI reasoning. For more details, refer to the [StructRAG research
paper](https://arxiv.org/abs/2410.08815).
node.md:
hash: ae61ebe19c109c00c8ec164b930653e0
summary: 'This guide provides a quick setup for running a node on Dria, a decentralized
network for AI collaboration developed by FirstBatch. It outlines steps to get
started without needing wallet activity, such as downloading the launcher from
the Dria website, running it with your ETH wallet private key, and selecting a
model to serve. The setup is designed to be completed in minutes with optional
API integrations. Important notes include compatibility tips for MacOS users and
post-setup actions like completing a form for a Discord role. Keywords: Decentralized
Network, AI Collaboration, Dria Network, Node Setup, FirstBatch.'
quickstart.md:
hash: 9b4daf510d23242bbfdcc65d7bef775a
summary: "This quick start guide demonstrates how to use the Dria SDK to create\
\ a dialogue between a math teacher and a student using the `MagPie` task. It\
\ involves setting up a `Dria` instance with the necessary modules and executing\
\ a task with predefined personas\u2014 a curious math student and a grumpy math\
\ professor assistant. The guide provides a script to generate a dialogue with\
\ multiple interaction turns, leveraging models like GPT-4O Mini. Ideal for those\
\ interested in AI-driven dialogue simulations, this guide also encourages exploring\
\ Dria's custom pipelines and tasks for more advanced applications. Key keywords\
\ include Dria SDK, MagPie task, dialogue simulation, AI models, and custom pipelines."