LLM Translate
The LLM Translate processor translates text from specified columns to the chosen language. The processor uses the Snowflake Large Language Model (LLM) Translate function to translate the data.
When you configure the processor, you specify the source column to evaluate and the language of the text in that column. You can configure the processor to auto-detect the language if you do not know it or if the column contains more than one language.
Then, you specify the target language to translate to and, optionally, an output column.
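As a rough illustration of what a single translate configuration amounts to, the following Python sketch builds a SQL expression around Snowflake's SNOWFLAKE.CORTEX.TRANSLATE function. It assumes that this is the function the processor relies on, that an empty source-language string requests auto-detection, and that languages are passed as codes such as en. The names TranslateConfig and translate_expression are hypothetical helpers for illustration, not part of the product.

```python
from dataclasses import dataclass
from typing import Optional

AUTO_DETECT = "AUTO_DETECT"


@dataclass
class TranslateConfig:
    """Hypothetical stand-in for one entry of the processor's translate configuration."""
    source_column: str
    source_language: str  # a language code such as "es", or AUTO_DETECT
    target_language: str  # a language code such as "en"
    output_column: Optional[str] = None


def translate_expression(cfg: TranslateConfig) -> str:
    # Assumption: an empty source-language string asks Snowflake to detect the language.
    source = "" if cfg.source_language == AUTO_DETECT else cfg.source_language
    # With no output column, the result takes the name of the source column.
    alias = cfg.output_column or cfg.source_column
    return (
        f"SNOWFLAKE.CORTEX.TRANSLATE({cfg.source_column}, "
        f"'{source}', '{cfg.target_language}') AS {alias}"
    )


cfg = TranslateConfig("reviews", AUTO_DETECT, "en")
print(translate_expression(cfg))
```

The sketch only assembles a string. It does not connect to Snowflake, and the SQL that the processor actually generates may differ.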
Example
Say you have customer reviews in a reviews column, but they are in multiple languages. You want the LLM Translate processor to translate the reviews to English and overwrite the original reviews column with the results.
To do this, you can configure the LLM Translate processor as follows:
- Source Column: reviews
- Source Language: AUTO_DETECT
- Target Language: English
- Output Column: not configured
Suppose the processor receives the following data:
productId | userId | reviews |
---|---|---|
B-634 | aabba6 | Esto es lo mejor! No sé cómo viví sin él durante tanto tiempo. |
FS-845 | 99louis | C'est un peu cher, mais ça marche très bien. |
S-212 | boo55 | non assomiglia per niente alla foto. |
The processor then produces the following output:
productId | userId | reviews |
---|---|---|
B-634 | aabba6 | This is the best! I don't know how I lived without him for so long. |
FS-845 | 99louis | It's a bit expensive, but it works very well. |
S-212 | boo55 | does not look like the photo at all. |
Notice how the translations replace the original reviews. To keep the original reviews in addition to the translations, you can simply specify a new output column name in the Output Column property.
Source and Output Columns
- Source columns: The processor evaluates data in the columns defined in the Source Column property.
- Output columns: The processor writes translated data into the columns defined in the Output Column property. The processor creates columns and overwrites data in output columns as follows:
  - When you define an output column that does not exist in incoming data, the processor creates the column.
  - When you define an output column that exists, the processor overwrites the data in the column.
  - When you do not define an output column, the processor places the data in the column being evaluated, overwriting the original data.
  For example, if you configure the processor to translate a Feedback column and do not specify an output column, the processor places the translated text into the Feedback column.
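The output column rules above can be sketched in a few lines of Python. This is a minimal model of the documented behavior applied to one row held as a dictionary, not the processor's implementation. The function apply_translation and the placeholder fake_translate are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, Optional


def apply_translation(
    row: Dict[str, str],
    source_column: str,
    translate: Callable[[str], str],
    output_column: Optional[str] = None,
) -> Dict[str, str]:
    """Return a copy of the row with translated text placed per the output column rules."""
    # No output column defined: write back to the column being evaluated.
    target = output_column or source_column
    result = dict(row)
    # Assignment creates the column if it is new and overwrites it if it exists.
    result[target] = translate(row[source_column])
    return result


def fake_translate(text: str) -> str:
    # Hypothetical placeholder for the real translation call.
    return f"<translated: {text}>"


row = {"Feedback": "Muy bueno"}
print(apply_translation(row, "Feedback", fake_translate))
print(apply_translation(row, "Feedback", fake_translate, "Feedback_en"))
```

In the first call the Feedback column is overwritten. In the second call the original text is kept and a new Feedback_en column is added.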
Configuring an LLM Translate Processor
Configure an LLM Translate processor to translate text in specified columns to another language.
- On the General tab, configure the following properties:
General Property | Description |
---|---|
Name | Stage name. |
Description | Optional description. |
Cache Data | Caches processed data. |
- On the Translate tab, configure the Translate Configurations property. Specify the following properties, as needed:
  - Source Column - Name of the column to evaluate. To evaluate multiple columns, you can use a regular expression to define a name pattern to match.
  - Source Language - Language of the text in the specified source column. You can use AUTO_DETECT if you do not know the language or the column includes multiple languages.
  - Target Language - Language to translate data to.
  - Output Column - Output column for the translated text. When not defined, the processor overwrites the associated source column. For information about defining multiple columns, see Source and Output Columns.
  To specify additional columns to evaluate, click Add Another or Bulk Edit Mode.
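To get a feel for how a regular expression in the Source Column property might select several columns, consider the following Python sketch. It assumes that the pattern must match the entire column name, which this page does not state, and it uses Python regular expression syntax, which may differ from the syntax the processor accepts. The function matching_columns and the column names are hypothetical.

```python
import re
from typing import List


def matching_columns(pattern: str, columns: List[str]) -> List[str]:
    # Assumption: the pattern must match the whole column name, not just part of it.
    regex = re.compile(pattern)
    return [name for name in columns if regex.fullmatch(name)]


columns = ["productId", "userId", "review_title", "review_body"]
print(matching_columns(r"review_.*", columns))
```

With these hypothetical column names, the pattern selects review_title and review_body and leaves the ID columns untouched.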