LLM Sentiment

The LLM Sentiment processor generates a sentiment score based on English-language text in specified columns. The processor uses the Snowflake Large Language Model (LLM) Sentiment function to evaluate the sentiment of the text and returns a floating-point score between -1 and 1 for each evaluated value.

The sentiment score uses 1 for sentiments evaluated as the most positive, -1 for those considered the most negative, and values around 0 for neutral sentiments.
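As a rough illustration of how downstream logic might interpret the score range described above, the following sketch maps a score to a coarse label. The function name label_sentiment and the 0.25 neutral band are hypothetical choices for this example; neither the processor nor Snowflake defines such thresholds.

```python
def label_sentiment(score, neutral_band=0.25):
    """Map a sentiment score in [-1, 1] to a coarse label.

    neutral_band is an assumed cutoff chosen for illustration only.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


print(label_sentiment(0.8874393))   # positive
print(label_sentiment(-0.8749435))  # negative
print(label_sentiment(0.1))         # neutral
```

Choose a band that suits your data. A wider band treats more lukewarm text as neutral.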

When you configure the processor, you specify the source columns to evaluate and optionally define output columns for the sentiment scores.

For more information about the Sentiment function and other Snowflake LLM functions, see the Snowflake documentation.

Note: At this time, Snowflake charges differently for LLM processing. See the Snowflake consumption rates for details.

Example

Say you want to generate a numeric sentiment score for user feedback on a product, stored in a Comments column. To write the sentiment score to a new Comments_Sentiment column, you configure the processor as follows:

Source Column: Comments
Output Column: Comments_Sentiment

With the following incoming data:


ProductId	UserId	Comments
507	smartguy64	totally bogus
812	lo234go	works ok
1563	99ball00ns	Best cheese grater I ever bought!
2360	ac123	this things terrible. dont waste your time.
355	msally55	worked great for just over a year, then the handle cracked off and it doesn't work now

The processor passes the following output downstream:


ProductId	UserId	Comments	Comments_Sentiment
507	smartguy64	totally bogus	-0.8749435
812	lo234go	works ok	0.3357192
1563	99ball00ns	Best cheese grater I ever bought!	0.8874393
2360	ac123	this things terrible. dont waste your time.	-0.83438915
355	msally55	worked great for just over a year, then the handle cracked off and it doesn't work now	-0.4387149

Notice how the "works ok" comment receives only a mildly positive score compared to the glowing review in the following row.
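For readers who want to picture the equivalent query, the sketch below builds a SQL statement for this example. It assumes the Sentiment function referred to is SNOWFLAKE.CORTEX.SENTIMENT and that the output column is new. The table name PRODUCT_FEEDBACK and the helper build_sentiment_query are hypothetical, and the SQL that the processor actually generates may differ.

```python
def build_sentiment_query(table, source_column, output_column):
    """Build a SELECT that adds a sentiment score column (illustrative only)."""

    def quote(identifier):
        # Double-quote identifiers so mixed-case names such as Comments survive.
        return '"' + identifier.replace('"', '""') + '"'

    return (
        f"SELECT *, SNOWFLAKE.CORTEX.SENTIMENT({quote(source_column)}) "
        f"AS {quote(output_column)} FROM {quote(table)}"
    )


# PRODUCT_FEEDBACK is an assumed table name for this example.
print(build_sentiment_query("PRODUCT_FEEDBACK", "Comments", "Comments_Sentiment"))
```

The sketch covers only the case where the output column does not yet exist. Overwriting an existing column would require excluding it from the selected columns.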

Source and Output Columns

Note the following details about defining source and output columns:

Source columns

The processor evaluates data in the columns defined in the Source Column property.

To evaluate multiple columns, you can define multiple sets of configurations. You can also use regular expressions to have the processor evaluate all columns with matching names.

Output columns

The processor writes generated sentiment scores into the columns defined in the Output Column property. The processor creates columns and overwrites data in output columns as follows:

When you define an output column that does not exist in incoming data, the processor creates the column.
When you define an output column that exists, the processor overwrites the data in the column.
When you do not define an output column, the processor places the data in the column being evaluated, overwriting the original data.
For example, if you configure the processor to evaluate a Feedback column and do not specify an output column, the processor places the sentiment score in the Feedback column.

Note: When specifying the output column, you can use $0 to represent the evaluated source column name, and then add preceding or following characters.

For example, say you use a regular expression to define the source columns to evaluate. If you specify $0_sentiment as the name of the corresponding output columns, the processor writes the sentiment score for a feedback source column to a new feedback_sentiment column.
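The naming rules above can be sketched in a few lines of Python. This is an illustration of the described behavior, not the processor's implementation: the function resolve_output_columns is hypothetical, and it assumes the regular expression must match the entire column name.

```python
import re


def resolve_output_columns(columns, source_pattern, output_template=None):
    """Map each matching source column to the column that receives its score."""
    matcher = re.compile(source_pattern)
    mapping = {}
    for name in columns:
        # Assumption: the pattern must match the whole column name.
        if matcher.fullmatch(name):
            if output_template is None:
                # No output column defined: the score overwrites the source column.
                mapping[name] = name
            else:
                # $0 stands for the evaluated source column name.
                mapping[name] = output_template.replace("$0", name)
    return mapping


columns = ["product_id", "feedback", "review"]
print(resolve_output_columns(columns, "feedback|review", "$0_sentiment"))
# {'feedback': 'feedback_sentiment', 'review': 'review_sentiment'}
print(resolve_output_columns(columns, "feedback"))
# {'feedback': 'feedback'}
```

With the $0_sentiment template, each matched column gets its own output column, so the original text is preserved alongside the score.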

Configuring an LLM Sentiment Processor

Configure an LLM Sentiment processor to evaluate and generate a sentiment score for English-language text in specified columns.

On the General tab, configure the following properties:

General Property	Description
Name	Stage name.
Description	Optional description.
Cache Data	Caches processed data.

On the Sentiment tab, configure the following property:


Sentiment Property	Description
Sentiment Configurations	Specify the following properties, as needed: Source Column - Name of the column to evaluate. To evaluate multiple columns, you can use a regular expression to define a name pattern to match. Output Column - Output column for the generated sentiment score. When not defined, the processor overwrites the associated source column. For information about defining multiple columns, see Source and Output Columns. To specify additional columns to evaluate, click Add Another or Bulk Edit Mode.