RLHF Prompt Studio
  • 19 Nov 2024

Overview

As generative AI grows in popularity and rapidly evolves, so does the need to fine-tune models for specific commercial needs, training them on proprietary data to extend their base capabilities.

Dataloop’s RLHF (Reinforcement Learning from Human Feedback) Studio enables prompt engineering by letting annotators provide feedback on machine learning (ML) model-generated responses to prompts. Both prompts and responses can be any type of item, for example, text or an image, and support for video and audio is coming soon.

The purpose of the RLHF Studio is to enable organizations to fine-tune their generative AI models. The studio supports multiple prompts and responses, organized in a sequential chat flow, where annotators can provide feedback at any stage, rank the best response, and offer necessary feedback to enhance the model.

Once you open the JSON file, the RLHF Studio is displayed. It has two sections: Conversation and Feedback.

For more information on the JSON format, see the RLHF JSON document.

Who Can Manage the RLHF Studio?

Project owners or Annotation managers can perform the following actions in the RLHF Studio:

Roles | Project Owner | Annotation Manager | Annotator
Verify the prompts and their responses.
Rank the responses.
Edit prompt's response texts.**
Provide answers to the questions in the feedback section.
Add comments
Send the answers for review.
Upload prompts data
Set feedback questions
Approve feedback questions
Report an issue
Enable Recipe Option to Edit the Responses

** To allow annotators to edit responses, you must enable the following option in the recipe associated with the prompt: Allow editing of a prompt response in the RLHF Studio.


Conversation

This section displays the prompts and their corresponding responses generated by the ML models.

  • Each prompt can have multiple response options (for example, Response A and Response B), each representing the output of a different model.

  • Each response is displayed in its own box, so users can compare responses side by side.

  • To edit response content directly, click the Expand icon and edit the text. This allows quick adjustments to the model’s output without re-running the model.

  • Responses can be ranked by accuracy or preference.

  • The best response can be selected by clicking the Crown icon.

  • The Text viewer for both Prompts and Responses supports Markdown formatting in addition to standard text. Use Markdown to create well-structured content with headings, bold, italics, and other formatting options. For example:

    • # Heading 1 -> Heading 1
    • ## Heading 2 -> Heading 2
    • **bold** -> bold

Prompts

A prompt is an instruction, question, stimulus, or cue given to the model to elicit a response. Dataloop supports more than one prompt for generating model responses. Project owners or managers upload the prompt data to the dataset.

How to Set a Prompt?

Use the JSON structure available here to create a prompt file, and upload it to Dataloop via the SDK.
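As a rough illustration of preparing such a file, the sketch below builds a prompt document in Python and serializes it to JSON. The field names (`shebang`, `metadata.dltype`, `prompts`, `mimetype`, `value`) and the helper names `build_prompt_item` and `to_json` are assumptions made for this example; check the RLHF JSON document for the actual schema before relying on them.

```python
import json


def build_prompt_item(prompts):
    """Build a prompt document from a {prompt_key: prompt_text} mapping.

    The field names below are assumptions for illustration,
    not a confirmed copy of the official schema.
    """
    return {
        "shebang": "dataloop",             # assumed marker field
        "metadata": {"dltype": "prompt"},  # assumed item-type tag
        "prompts": {
            key: [{"mimetype": "application/text", "value": text}]
            for key, text in prompts.items()
        },
    }


def to_json(item):
    """Serialize the prompt document so it can be saved as a .json file."""
    return json.dumps(item, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    item = build_prompt_item(
        {"prompt1": "Summarize the attached report in three sentences."}
    )
    print(to_json(item))
```

The printed string can be saved to a `.json` file and then uploaded to the dataset with the SDK. The upload call itself is left out here because it depends on your project and SDK setup.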

Response

A response is the action or answer that occurs as a result of a prompt. You can either create the response in the Prompt Studio itself, by clicking the Plus icon in the prompt's section, or use the SDK to upload responses created by ML models. Dataloop supports more than two model responses, so a prompt might have multiple responses generated by different model versions or by completely different models.

How to Add a Response to the Prompt?

  1. Open the JSON file in the Prompt Studio.
  2. Click the Plus icon to display the Response dialog.
  3. Enter your response text or a URL for the prompt. If entering a URL, enable the Stream option.
  4. Click Save. The new response will be created and added below the prompt.
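If you prepare responses in bulk before entering them, the text-or-URL distinction in step 3 can be checked programmatically. The sketch below is only an illustration: the `build_response` helper and its `value` and `stream` fields are hypothetical names that mirror the Stream option in the dialog, not a documented Dataloop structure.

```python
from urllib.parse import urlparse


def looks_like_url(value):
    """Return True when the value parses as an http(s) URL with a host."""
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def build_response(value):
    """Return a response entry (hypothetical field names).

    'stream' is set for URLs, mirroring the Stream option in the dialog.
    """
    cleaned = value.strip()
    return {"value": cleaned, "stream": looks_like_url(cleaned)}


if __name__ == "__main__":
    print(build_response("https://example.com/cat.png"))
    print(build_response("Paris is the capital of France."))
```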

How Can I Generate Responses Using Dataloop's Models?

To generate responses using Dataloop's models, follow these steps:

  1. Access the Marketplace:
    1. Navigate to the Marketplace within the Dataloop platform.
    2. Go to the Dataloop Hub and select the Pipelines tab.
  2. Install the RLHF Pipeline:
    1. Locate the RLHF pipeline and click Install.
    2. Once installed, the pipeline will appear in your workspace.
  3. Configure the Dataset Node:
    1. Click on the Dataset node to open its configuration settings.
    2. Optionally, rename the node for clarity.
    3. From the dataset list, select the appropriate dataset to use.
  4. Set Up Predict Nodes:
    1. For each Predict node:
      1. Open its configuration settings.
      2. Optionally, rename the node for better identification.
      3. Choose the desired model from the available list for each node.
  5. Create a Labeling Task:
    1. Select the Workflow Labeling node.
    2. Set up a labeling task to either rank the responses or identify the best one.
  6. Start the Pipeline:
    1. After completing the configurations, click Start Pipeline.
    2. The models will process the prompts and generate responses, which will then be available for annotators to review and rank.
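To make the data flow of these steps concrete, here is a minimal plain-Python sketch of what the Predict nodes conceptually do: each prompt is passed to every configured model, and the collected responses are handed on for labeling. This is not the Dataloop pipeline API. The function `run_predict_nodes`, the record layout, and the stub models are all assumptions for illustration.

```python
def run_predict_nodes(prompts, models):
    """Send every prompt to every model and collect the responses.

    prompts: list of prompt strings (stand-in for the Dataset node).
    models:  {node_name: callable(prompt) -> response} (stand-in for Predict nodes).
    Returns one record per prompt, ready to be reviewed and ranked.
    """
    records = []
    for prompt in prompts:
        responses = {name: model(prompt) for name, model in models.items()}
        records.append({"prompt": prompt, "responses": responses})
    return records


if __name__ == "__main__":
    # Stub "models" used only to show the flow; real models would generate text.
    stub_models = {
        "model_a": lambda p: p.upper(),
        "model_b": lambda p: p[::-1],
    }
    for record in run_predict_nodes(["hello"], stub_models):
        print(record)
```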

How to Edit a Response's Text Content?

  1. Open the JSON file in the Prompt Studio.
  2. Click the Expand icon in the Response section.
  3. Edit the content as needed.
  4. Click Back. The changes will be updated.

How to Rank the Model's Responses?

Dataloop RLHF Studio allows the annotator to rank the model responses.

  1. Open the JSON file in the Prompt Studio.
  2. On the Response section, click on the dropdown and select a response rank (0 to 3) from the list.
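When ranks are processed outside the Studio, a small model of the rank values can help catch invalid input. The sketch below assumes the 0 to 3 range from the dropdown. The `Response` class and the helpers are hypothetical, and treating a higher number as a better rank is an assumption that you should adjust to your project's convention.

```python
from dataclasses import dataclass
from typing import Optional

RANK_MIN, RANK_MAX = 0, 3  # range offered by the rank dropdown


@dataclass
class Response:
    label: str                  # for example "A", "B", "C"
    text: str
    rank: Optional[int] = None  # None means "not ranked yet"


def set_rank(response, rank):
    """Assign a rank, rejecting values outside the dropdown range."""
    if not RANK_MIN <= rank <= RANK_MAX:
        raise ValueError(f"rank must be between {RANK_MIN} and {RANK_MAX}")
    response.rank = rank


def order_by_rank(responses, highest_first=True):
    """Return only the ranked responses, sorted by rank.

    Assumption: a higher number means a better response.
    """
    ranked = [r for r in responses if r.rank is not None]
    return sorted(ranked, key=lambda r: r.rank, reverse=highest_first)


if __name__ == "__main__":
    a, b = Response("A", "first answer"), Response("B", "second answer")
    set_rank(a, 1)
    set_rank(b, 3)
    print([r.label for r in order_by_rank([a, b])])
```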

How to Identify the Best Response?

Dataloop allows you to select the best responses at the prompt level. For example, if a prompt has responses A, B, and C generated by different models, the annotator can provide information on which response is the best in the Conversation section.

  1. Open the JSON file in the Prompt Studio.
  2. Identify the best response in the Response section.
  3. Click on the Crown icon, and click the Save icon to save the update.
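One common downstream use of a best-response selection is to turn it into preference pairs for fine-tuning. The sketch below shows that idea under stated assumptions: the `preference_pairs` helper and the `prompt`, `chosen`, and `rejected` keys are illustrative names, and this conversion is not something the Studio performs for you.

```python
def preference_pairs(prompt, responses, best_label):
    """Pair the best response against every other response.

    responses:  {label: response_text}, for example {"A": "...", "B": "..."}.
    best_label: label of the response marked as best.
    Returns a list of {"prompt", "chosen", "rejected"} dicts (hypothetical layout).
    """
    if best_label not in responses:
        raise KeyError(f"unknown response label: {best_label}")
    chosen = responses[best_label]
    return [
        {"prompt": prompt, "chosen": chosen, "rejected": text}
        for label, text in responses.items()
        if label != best_label
    ]


if __name__ == "__main__":
    pairs = preference_pairs(
        "What is 2 + 2?",
        {"A": "4", "B": "5", "C": "22"},
        best_label="A",
    )
    for pair in pairs:
        print(pair)
```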

How to Delete a Response?

  1. Open the JSON file in the Prompt Studio.
  2. Identify the response to be deleted.
  3. Click the Delete icon. The response will be deleted.

Feedback

This feature allows annotators to answer questions and provide feedback on model-generated responses. The questions in the feedback section are defined and customized by the project owner or manager via the path: Ontology > Your Recipe > Labels & Attributes tab > Create Section. The list of responses appears on the right side under the Feedback section based on the selection in Conversation > Response.

How to Set Feedback Questions?

To customize the questions displayed in the feedback section:

  1. As the project owner or developer, go to Ontology > Your Recipe > Labels & Attributes tab.
  2. Select Create Section to define or edit the feedback questions by adjusting the recipe’s attributes.

Dataloop supports various question types for feedback, including scales, multiple-choice, yes/no, and open-ended questions.
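When planning the questions before creating them in the recipe, it can help to model the four question types and what counts as a valid answer for each. The sketch below is a planning aid only: the `Question` class, the kind names, and the default 1 to 5 scale are assumptions, not the recipe's attribute format.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    text: str
    kind: str                                    # "scale", "multiple_choice", "yes_no", "open_ended"
    options: list = field(default_factory=list)  # used by multiple_choice
    scale: tuple = (1, 5)                        # assumed default range for scale questions


def is_valid_answer(question, answer):
    """Check that an answer fits the question type."""
    if question.kind == "scale":
        low, high = question.scale
        # bool is a subclass of int, so exclude it explicitly
        return isinstance(answer, int) and not isinstance(answer, bool) and low <= answer <= high
    if question.kind == "multiple_choice":
        return answer in question.options
    if question.kind == "yes_no":
        return isinstance(answer, bool)
    if question.kind == "open_ended":
        return isinstance(answer, str) and bool(answer.strip())
    raise ValueError(f"unknown question kind: {question.kind}")


if __name__ == "__main__":
    q = Question("How accurate is the response?", "scale")
    print(is_valid_answer(q, 4))
```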

How to Respond to Feedback Questions?

In the Feedback section of the RLHF Studio, annotators can respond to various question types, including scale ratings, multiple choice, yes/no, and open-ended questions. Here’s a guide to using feedback features effectively:

  • Provide Comments: Use the comment (icon) feature to elaborate on your feedback. Once you’ve completed your answer, click the Save icon to submit your response.
  • Report an Issue: If you identify a problem in the feedback provided, use the Open Issue icon to flag it. This helps annotators to track and address issues efficiently.
  • Mark Feedback for Review: Use the For Review option to mark responses by annotators that had issues during the QA process. This makes it easier for a QA reviewer to revisit flagged feedback for further evaluation.
  • Approve Feedback: Use the Approve icon to confirm that the feedback answers meet the required standards and are ready for final approval.

RLHF Studio Keyboard Shortcuts

General Shortcuts

Action | Keyboard Shortcut
Save | S
Delete | Delete
Undo | Ctrl + Z
Redo | Ctrl + Y
Zoom In/Out | Scroll
Change Brightness | Vertical Arrow + M
Change Contrast | Vertical Arrow + R
Pan | Ctrl + Drag
Search Label | Shift + L
Navigate in label picker | Up and Down arrows
Select label in label picker | Enter
Tool Selection | 0-9 (1-6 for Segmentation Studio)
Move selected annotations | Shift + Arrow Keys
Previous Item | Left Arrow
Next Item | Right Arrow
Add Item Description | T
Mark Item as Done | Shift + F
Mark Item as Discarded | Shift + G
Enable Cross Grid Tool Helper | Alt + G
Hold G to show Cross Grid Measurements | G
Hide/Show Selected Annotation | H
Hide/Show All Annotations | J
Show Unmasked Pixels | Ctrl + M
Hide/Show Annotation Controllers | C
Set Object ID menu | O
Toggle pixel measurement | P
Use tool creation mode | Hold Shift
Copy annotations from previous item | Shift + V