LLM Extract Answer

The LLM Extract Answer processor extracts answers to a question from English-language text in specified columns.

The processor uses the Snowflake Large Language Model (LLM) Extract Answer function to extract answers and to generate a floating-point confidence score for each answer.

The processor includes both the answer and the confidence score in an array written to the specified output columns. The confidence score ranges from 0 to 1. Values closer to 1 indicate higher confidence, and values closer to 0 indicate lower confidence.

When you configure the LLM Extract Answer processor, you define the source columns to evaluate and the question to ask. You can optionally define the output columns to use.

For more information about the Snowflake Extract Answer function and other Snowflake LLM functions, see the Snowflake documentation.

Note: At this time, Snowflake charges differently for LLM processing. See the Snowflake consumption rates for details.

Example

Say you want to extract an ingredients list from recipes in a recipe column, and to write the results to an ingredients column. To do this, you might configure the LLM Extract Answer processor as follows:

Source Column: recipe
Question: What are the ingredients in this recipe?
Output Column: ingredients

Say the processor receives the following data:


title	recipe
vanilla cake	Cream the butter, oil, and sugar in the bowl of a stand mixer. Add the eggs one at a time, beating well after each addition. Then, add your vanilla and stir to combine. Combine your dry ingredients in a separate bowl, then add about ⅓ of the mixture into your bowl. Use a spatula to gently stir until just combined. Follow this with about ½ of your buttermilk, and stir again until just combined. Add ½ of the remaining dry ingredients stir, and then add the remainder of the buttermilk. Finish with the final portion of dry ingredients and use your spatula to make sure the batter is smooth.
hollandaise sauce	This traditional Eggs Benedict sauce is called hollandaise sauce. The ingredients are butter, egg yolks, lime juice, heavy cream, and salt and pepper. This is a more traditional method for making hollandaise sauce. Some people prefer to make hollandaise sauce in the blender, which would work well for this recipe. To make hollandaise sauce, start by melting butter in a saucepan. Meanwhile, beat egg yolks in separate bowl and add lime juice, heavy cream, and salt and pepper. Once the butter has melted, you’re ready to temper the eggs by adding a small amount of the hot butter to the egg mixture. Stir it well and repeat this process, slowly adding one spoonful of hot butter to the egg mixture. We do this to avoid curdling the eggs. Finally, add the mixture back to the saucepan and cook it for a few more seconds.

After processing the data, the processor passes the following data downstream:


title	recipe	ingredients
vanilla cake	Cream the butter, oil, and sugar in the bowl of a stand mixer. Add the eggs one at a time, beating well after each addition. Then, add your vanilla and stir to combine. Combine your dry ingredients in a separate bowl, then add about ⅓ of the mixture into your bowl. Use a spatula to gently stir until just combined. Follow this with about ½ of your buttermilk, and stir again until just combined. Add ½ of the remaining dry ingredients stir, and then add the remainder of the buttermilk. Finish with the final portion of dry ingredients and use your spatula to make sure the batter is smooth.	[ { "answer": "butter, oil, and sugar", "score": 0.05580167 } ]
hollandaise sauce	This traditional Eggs Benedict sauce is called hollandaise sauce. The ingredients are butter, egg yolks, lime juice, heavy cream, and salt and pepper. This is a more traditional method for making hollandaise sauce. Some people prefer to make hollandaise sauce in the blender, which would work well for this recipe. To make hollandaise sauce, start by melting butter in a saucepan. Meanwhile, beat egg yolks in separate bowl and add lime juice, heavy cream, and salt and pepper. Once the butter has melted, you’re ready to temper the eggs by adding a small amount of the hot butter to the egg mixture. Stir it well and repeat this process, slowly adding one spoonful of hot butter to the egg mixture. We do this to avoid curdling the eggs. Finally, add the mixture back to the saucepan and cook it for a few more seconds.	[ { "answer": "butter, egg yolks, lime juice, heavy cream, and salt and pepper", "score": 0.9479347 } ]

Notice that the LLM Extract Answer processor did not identify all of the ingredients in the vanilla cake recipe in the first row, but it assigned a low confidence score to that answer. For the hollandaise sauce recipe in the second row, the processor extracted all of the ingredients and assigned a high confidence score.
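
Because each result carries a confidence score, downstream logic can decide which answers to trust. The following Python code is a minimal sketch of that idea and is not part of the processor. The function name confident_answers and the 0.5 threshold are assumptions chosen for illustration, and the sketch assumes that the output column value is available as JSON text.

```python
import json

def confident_answers(cell, threshold=0.5):
    """Return the answers in an output-column value whose score meets the threshold.

    `cell` is the JSON text of the array written by the processor, for example
    '[ { "answer": "...", "score": 0.9 } ]'.
    """
    entries = json.loads(cell)
    return [entry["answer"] for entry in entries if entry["score"] >= threshold]

# Values copied from the example output above.
cake = '[ { "answer": "butter, oil, and sugar", "score": 0.05580167 } ]'
sauce = ('[ { "answer": "butter, egg yolks, lime juice, heavy cream, '
         'and salt and pepper", "score": 0.9479347 } ]')

print(confident_answers(cake))   # the low-confidence answer is dropped
print(confident_answers(sauce))  # the high-confidence answer is kept
```

With this threshold, the vanilla cake answer is filtered out and the hollandaise sauce answer is kept. A suitable threshold depends on the data and the question, so treat 0.5 as a starting point rather than a recommendation.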

Source and Output Columns

Note the following details about defining source and output columns:

Source columns

The processor evaluates data in the columns defined in the Source Column property.

Each source column must contain English-language text in string format or in a JSON object.

To evaluate multiple columns, you can define multiple sets of configurations. You can also use regular expressions to have the processor evaluate all columns with matching names.

Output columns

The processor writes extracted answers and generated confidence scores as an array into the columns defined in the Output Column property. The processor creates columns and overwrites data in output columns as follows:

When you define an output column that does not exist in incoming data, the processor creates the column.
When you define an output column that exists, the processor overwrites the data in the column.
When you do not define an output column, the processor places the data in the column being evaluated, overwriting the original data.

Note: When specifying the output column, you can use $0 to represent the evaluated source column name, and then add preceding or following characters.

For example, say you use a regular expression to define the source columns to evaluate. If you specify $0_problem as the output column name, the processor writes the answer for a source column named feedback to a new feedback_problem column.
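
The column rules above can be modeled in a few lines of Python. This is a sketch for reasoning about a configuration, not the processor's implementation. The function name resolve_output_columns and the sample column names are hypothetical, and the sketch assumes that a regular expression must match the whole column name, which may differ from the processor's matching behavior.

```python
import re

def resolve_output_columns(incoming, source_pattern, output_template=None):
    """Map each matching source column to an (output column, action) pair.

    The action is "create" when the output column does not exist in the
    incoming data and "overwrite" when it does.
    """
    # Assumption: the pattern must match the entire column name.
    matcher = re.compile(source_pattern)
    plan = {}
    for name in incoming:
        if not matcher.fullmatch(name):
            continue
        if output_template is None:
            # No output column defined: results replace the evaluated column.
            target = name
        else:
            # $0 stands for the name of the evaluated source column.
            target = output_template.replace("$0", name)
        action = "overwrite" if target in incoming else "create"
        plan[name] = (target, action)
    return plan

columns = ["id", "feedback", "feedback_followup"]
print(resolve_output_columns(columns, "feedback.*", "$0_problem"))
print(resolve_output_columns(columns, "feedback"))
```

The first call plans new feedback_problem and feedback_followup_problem columns. The second call defines no output column, so the plan overwrites the feedback column itself.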

Configuring an LLM Extract Answer Processor

Configure an LLM Extract Answer processor to extract an answer to a question from English-language text in specified columns.

On the General tab, configure the following properties:

General Property	Description
Name	Stage name.
Description	Optional description.
Cache Data	Caches processed data.

On the Extract Answer tab, configure the following property:


Extract Answer Property	Description
Extract Answer Configurations	Specify the following properties, as needed: Source Column - Name of the column that contains the data to extract the answer from. Specify a column with English-language text in string format or in a JSON object. You can use a regular expression to define a name pattern to match. Question - Question to answer. Output Column - Output column for the extracted answer and confidence score. When not defined, the processor overwrites the associated source column. For more information, see Source and Output Columns. To specify additional columns to evaluate, click Add Another or Bulk Edit Mode.