Pipeline Step Summary
Pipeline steps
The following pipeline steps are supported in Tag.
Some steps require an external XProc engine (like the embedded Morgana engine), while other steps only work in Tag (i.e., tag:* extension steps).
- Container
- Standard
- p:add-attribute
- p:add-xml-base
- p:archive
- p:archive-manifest
- p:cast-content-type
- p:compare
- p:compress
- p:count
- p:delete
- p:error
- p:filter
- p:hash
- p:http-request
- p:identity
- p:insert
- p:json-join
- p:json-merge
- p:label-elements
- p:load
- p:make-absolute-uris
- p:namespace-delete
- p:namespace-rename
- p:pack
- p:rename
- p:replace
- p:set-attributes
- p:set-properties
- p:sink
- p:split-sequence
- p:store
- p:string-replace
- p:text-count
- p:text-head
- p:text-join
- p:text-replace
- p:text-sort
- p:text-tail
- p:unarchive
- p:uncompress
- p:unwrap
- p:uuid
- p:wrap
- p:wrap-sequence
- p:www-form-urldecode
- p:www-form-urlencode
- p:xinclude
- p:xquery
- p:xslt
- File
- Validation
- Tag extension steps
Private Tag extension steps - not available in general release yet (contact us to find out more).
- Optical character recognition
- Entity detection within formatted or plain text
- Text analysis to detect sentiment (the mood of a document), key phrases, language and syntax (e.g., find nouns and their related verbs)
- Translation between a wide range of languages
- Topic modeling – scan a collection of documents and establish common themes or subjects
- Text-to-speech
- Speech-to-text which can be used to transcribe call center calls or other voice recordings into text
p:if
This step provides a guard for the child steps it contains. If the test attribute expression returns true, all child steps will be run. If it returns false, nothing happens.
Attributes
- test - requires an expression that returns true or false
- collection - if true, the (XPath) default collection will contain all documents passed to this step, and the context item will be undefined (default is false)
More information on p:if can be found in the XProc standard.
p:choose
This step makes a choice between multiple outcomes. Each outcome is defined by a p:when child step, which has a test attribute expression. A p:choose may also have a p:otherwise child step.
When this step runs, each p:when is tested in sequence. The first one with a test that returns true wins and is the only outcome to run. If none of the p:when tests return true, the p:otherwise step will run if it exists.
Note that p:when steps are very similar to p:if steps. In particular, they have a collection attribute that works the same way.
More information on p:choose can be found in the XProc standard.
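The first-match-wins rule can be modelled in a few lines. This is a minimal Python sketch of the behaviour described above, not Tag or XProc code: documents are plain strings, and the names choose, whens and otherwise are hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

Doc = str  # stand-in for a document (assumption: real steps handle XML, JSON, text, ...)
Branch = Tuple[Callable[[Doc], bool], Callable[[Doc], Doc]]  # (test, body) like a p:when


def choose(doc: Doc, whens: Sequence[Branch],
           otherwise: Optional[Callable[[Doc], Doc]] = None) -> Optional[Doc]:
    """Run the body of the first branch whose test returns true."""
    for test, body in whens:
        if test(doc):
            return body(doc)   # first match wins; later branches are never tested
    if otherwise is not None:
        return otherwise(doc)  # fallback, like p:otherwise
    return None                # no branch ran


# Both tests are true for "<a/>", but only the first branch runs.
branches = [
    (lambda d: d.startswith("<"), lambda d: "looks like markup"),
    (lambda d: len(d) > 0, lambda d: "non-empty"),
]
picked = choose("<a/>", branches, otherwise=lambda d: "empty")
fallback = choose("", branches, otherwise=lambda d: "empty")
```

Because the order of the branches decides the outcome, the most specific test usually goes first.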
p:for-each
This step stores a list of child steps that may be run zero or more times. It provides a looping mechanism for all documents passed to it.
When this step runs, it runs all child steps once per document, passing in only one document at a time. The output from this step contains the results from all runs arranged into one sequence (using output ports defined by its last child step).
When the child steps are run, a current input port is automatically created to store the single document passed to the child steps for that run. That document is also passed in to the first child step as the default readable port.
An alternative to passing documents to this step is to use a p:with-input instruction which can load external documents, pipe them from other steps, or define inline documents.
More information on p:for-each can be found in the XProc standard.
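The looping behaviour described above amounts to "run the child steps once per document and concatenate the results". A minimal Python sketch under that reading follows; for_each and body are hypothetical names, and documents are modelled as strings.

```python
from typing import Callable, Iterable, List


def for_each(docs: Iterable[str], body: Callable[[str], List[str]]) -> List[str]:
    """Run `body` once per document and gather all results into one sequence."""
    results: List[str] = []
    for current in docs:               # `current` plays the role of the current port
        results.extend(body(current))  # each run may produce zero or more documents
    return results


upper = for_each(["a", "b"], lambda d: [d.upper()])
nothing = for_each([], lambda d: [d])  # zero input documents means zero runs
```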
p:group
This step is a convenience wrapper for its child steps. It runs as a subpipeline in the same way that a pipeline does.
More information on p:group can be found in the XProc standard.
p:viewport
This step works on a single XML or HTML input document, and can process multiple chunks (subtrees) of it in sequence.
It uses a match attribute pattern to select a list of nodes. Each node is wrapped in a document (if necessary) and passed to the child steps one at a time. This temporary document is also made available using the named current input port.
The output from this step is a sequence of documents (one for each matched node). Each one is a copy of the input document, where the matched node is replaced by the result of running the child steps for that node. In this way multiple "views" of the input document are provided.
Attributes
- match - requires an XSLT selection pattern
This step requires an external XProc engine like Morgana.
More information on p:viewport can be found in the XProc standard.
p:add-attribute
This step adds a single attribute to a set of matching elements. The match option selects zero or more elements to modify.
Inputs
- source - accepts: xml html
Outputs
- result - produces: xml html
Options
- attribute-name - requires a name for the attribute
- attribute-value - requires a value for the attribute
- match - an XSLT selection pattern (default is '/*')
Note that p:set-attributes can be used to set multiple attributes at once.
More information on p:add-attribute can be found in the XProc standard.
p:add-xml-base
This step changes the xml:base attribute used by expressions to resolve relative URLs. This lets you point at a folder of images and then refer to them in pipeline steps using local file names.
Inputs
- source - accepts: xml html
Outputs
- result - produces: xml html
Options
- all - if true, an xml:base attribute is added to every element; if false, only to the document element and to elements whose base URI differs from their parent's (default is false)
- relative - if true use a URI relative to the inherited base URI, otherwise use a full URI
This step requires an external XProc engine like Morgana.
More information on p:add-xml-base can be found in the XProc standard.
p:archive
Creates a ZIP file archive of all documents passed to it, or of all files specified by the manifest input port. The report output port stores a manifest report of the created ZIP file. Updates of existing ZIP files are possible using the archive input port.
Inputs
- source - accepts: any
- manifest - accepts: xml
- archive - accepts: any
Outputs
- result - produces: any
- report - produces: application/xml
Options
- format - output file format (default is ZIP)
- parameters - optional parameters (ref: Morgana)
- relative-to - can be used when creating a manifest
This step requires an external XProc engine like Morgana.
More information on p:archive can be found in the XProc standard.
p:archive-manifest
Creates an archive manifest for the ZIP file passed to it.
Inputs
- source - accepts: any
Outputs
- result - produces: application/xml
Options
- format - input file format (default is ZIP)
- override-content-types - used to partially override the content-type mechanism
- parameters - optional parameters (ref: Morgana)
- relative-to - can be used when creating the manifest
This step requires an external XProc engine like Morgana.
More information on p:archive-manifest can be found in the XProc standard.
p:cast-content-type
Creates a new document by changing the media type of the document passed to it. This converts files between XML, HTML, JSON and text according to specific rules.
Inputs
- source - accepts: any
Outputs
- result - produces: any
Options
- content-type - requires the content type to create
- parameters - optional parameters (not used in Tag)
The accepted content types include: application/xml, text/html, application/json and text/plain.
More information on p:cast-content-type can be found in the XProc standard.
p:compare
Compares two single documents for equality. A c:result document is produced as output containing "true" or "false". If differences are found, a summary of them may appear on the differences port.
Inputs
- source - accepts: any
- alternate - accepts: any
Outputs
- result - produces: application/xml
- differences - produces: any
Options
- fail-if-not-equal - if true, report an error instead of creating a result document
- method - comparison method (default is "deep-equal")
- parameters - optional parameters (ref: Morgana)
More information on p:compare can be found in the XProc standard.
p:compress
Compresses a single document using the GZIP format.
Inputs
- source - accepts: any
Outputs
- result - produces: any
Options
- format - output file format (default is GZIP)
- parameters - optional parameters (ref: Morgana)
- serialization - optional serialization property (ref: Morgana)
This step requires an external XProc engine like Morgana.
More information on p:compress can be found in the XProc standard.
p:count
Counts the number of documents passed to it. Stores a single c:result element containing the count on the result output port.
Inputs
- source - accepts: any
Outputs
- result - produces: application/xml
Options
- limit - if greater than 0, the step counts at most that many documents (e.g., a limit of 2 can check whether more than one document exists without processing all of them)
More information on p:count can be found in the XProc standard.
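The effect of the limit option can be sketched as a count that stops early. This is an illustrative Python model only; count_documents is a hypothetical name and the input is any iterable standing in for the document sequence.

```python
from itertools import islice
from typing import Iterable


def count_documents(docs: Iterable[object], limit: int = 0) -> int:
    """Count documents, stopping after `limit` of them when limit > 0."""
    if limit > 0:
        # islice stops pulling from the input once `limit` items were seen
        return sum(1 for _ in islice(docs, limit))
    return sum(1 for _ in docs)


total = count_documents(["a", "b", "c"])
capped = count_documents(iter(range(1000)), limit=2)
```

A result equal to the limit means "at least that many", which is often all a pipeline needs to know.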
p:delete
Deletes items specified by the match option from the document passed to it and stores the resulting document on the result port.
Inputs
- source - accepts: xml html
Outputs
- result - produces: text xml html
Options
- match - requires an XSLT selection pattern
More information on p:delete can be found in the XProc standard.
p:error
Generates a dynamic error using the input passed to the step. The error can be caught using a p:try step - the result output port is an authoring convenience and is never actually updated.
Inputs
- source - accepts: text xml
Outputs
- result - produces: any
Options
- code - requires a unique code to identify the error
This step requires an external XProc engine like Morgana.
More information on p:error can be found in the XProc standard.
p:filter
Selects portions of the document passed to it based on an expression, and stores them on the result output port.
Inputs
- source - accepts: xml html
Outputs
- result - produces: text xml html
Options
- select - requires an expression to select content
This step requires an external XProc engine like Morgana.
More information on p:filter can be found in the XProc standard.
p:hash
Generates a hash, or digital "fingerprint", for the document passed to it, and injects it (using hexadecimal characters) into the source document. The result is stored on the result output port.
The match option is used to select nodes (the default pattern selects the child nodes of the document element). Each selected node is used to create a hash, and then is replaced by that hash. If the document node is replaced, the result is a text file that only contains the hash.
Inputs
- source - accepts: xml html
Outputs
- result - produces: text xml html
Options
- algorithm - requires one of: "crc", "md", or "sha"
- value - requires a string to use when creating the hash
- match - an XSLT selection pattern (default is '/*/node()')
- parameters - optional parameters (ref: Morgana)
- version - optional version of the algorithm used
This step requires an external XProc engine like Morgana.
More information on p:hash can be found in the XProc standard.
p:http-request
Used to call web APIs using http or https internet URLs. If the method (e.g., POST) supports a body, the request body is constructed using the document(s) passed to the source input port. The response from the call is stored on the result output port.
Details about the outcome of the request will appear as a map on the report output port. The map will contain entries like status-code, base-uri and headers.
Inputs
- source - accepts: any
Outputs
- result - produces: any
- report - produces: application/json
Options
- href - requires a URL to call
- assert - if this expression returns false report an error
- auth - optional map of authorization information (e.g., username, password, MD5 checksum)
- headers - map of HTTP header values
- method - HTTP request method which can be "GET", "POST", "HEAD", "PUT", "DELETE", "CONNECT", "OPTIONS" or "TRACE" (default is "GET")
- parameters - optional parameters (ref: XProc spec, Morgana)
- serialization - used to control serialization of request body during a POST request
More information on p:http-request can be found in the XProc standard.
p:identity
Stores an exact copy of what is passed to it on the result output port. These steps can provide a handy way to load documents for the next step using p:inline, p:pipe or p:document child instructions.
Inputs
- source - accepts: any
Outputs
- result - produces: any
More information on p:identity can be found in the XProc standard.
p:insert
Inserts the insertion input port's document into the source input port's document using the match selection pattern.
For every matched node, the insertion is made according to the position option.
Inputs
- source - accepts: xml html
- insertion - accepts: xml html
Outputs
- result - produces: xml html
Options
- match - an XSLT selection pattern (default is '/*')
- position - one of "first-child", "last-child", "before" or "after" (default is "after")
More information on p:insert can be found in the XProc standard.
p:json-join
Joins the sequence of documents passed to it into a single JSON document (an array) and stores it on the result output port. If any input documents are not JSON, they are automatically converted to JSON content if possible.
Inputs
- source - accepts: any
Outputs
- result - produces: application/json
Options
- flatten-to-depth - controls how content appearing on the source input port is flattened
This step requires an external XProc engine like Morgana.
More information on p:json-join can be found in the XProc standard.
p:json-merge
Merges the sequence of documents passed to it into a single JSON document (a map/object) and stores it on the result output port. If any input documents are not JSON, they are automatically converted to JSON content if possible.
Inputs
- source - accepts: any
Outputs
- result - produces: application/json
Options
- duplicates - one of "reject", "use-first", "use-last", "use-any" or "combine" (default is "use-first")
- key - expression used when merging sequences, arrays and maps to create unique keys (default is 'concat("_",$p:index)')
This step requires an external XProc engine like Morgana.
More information on p:json-merge can be found in the XProc standard.
p:label-elements
Generates a label for each matched element and stores that label in the specified attribute.
Inputs
- source - accepts: xml html
Outputs
- result - produces: xml html
Options
- attribute - name of the attribute to insert (default is 'xml:id')
- label - expression to create the label (default is 'concat("_",$p:index)')
- match - an XSLT selection pattern (default is '*')
- replace - if true allow replacement of existing attribute (default is true)
This step requires an external XProc engine like Morgana.
More information on p:label-elements can be found in the XProc standard.
p:load
Has no inputs, but stores the document specified by the href option on the result output port. The loaded document content type can be XML, JSON, HTML, text or "other" binary data.
Outputs
- result - produces: any
Options
- href - requires URI to load the document from
- content-type - can override the automatically detected content type
- document-properties - optional XProc document properties to apply
- parameters - these vary by loaded content type
More information on p:load can be found in the XProc standard.
p:make-absolute-uris
Makes an element or attribute's value in the document passed to it an absolute URI in the result document. For every node selected by the match option, its string value is resolved against the specified base URI and the resulting URI is used as the matched node's entire contents in the result document.
Inputs
- source - accepts: xml html
Outputs
- result - produces: xml html
Options
- match - requires an XSLT selection pattern
- base-uri - used to resolve attribute URIs
This step requires an external XProc engine like Morgana.
More information on p:make-absolute-uris can be found in the XProc standard.
p:namespace-delete
Deletes all namespaces identified by the specified prefixes from the document passed to it. The namespace declarations are removed, and any nodes that use those namespaces will have no namespace in the document stored on the result output port.
Inputs
- source - accepts: xml html
Outputs
- result - produces: xml html
Options
- prefixes - requires a list of namespace prefixes to delete
This step requires an external XProc engine like Morgana.
More information on p:namespace-delete can be found in the XProc standard.
p:namespace-rename
Renames any namespace declaration, or use of a namespace, in the document passed in to a new namespace URI. The command may affect elements, attributes or both according to the apply-to option.
Inputs
- source - accepts: xml html
Outputs
- result - produces: xml html
Options
- apply-to - one of "all", "elements" or "attributes" (default is "all")
- from - the namespace URI to be renamed which may be empty
- to - the new namespace URI which may be empty
This step requires an external XProc engine like Morgana.
More information on p:namespace-rename can be found in the XProc standard.
p:pack
Merges two document sequences into one. The step takes each pair of documents (one from source and one from alternate input ports), wraps them with a new element specified by the wrapper option, and writes that element to the result output port as a document.
If either input sequence is longer than the other, each of its remaining documents is wrapped by itself.
Inputs
- source - accepts: text xml html
- alternate - accepts: text xml html
Outputs
- result - produces: application/xml
Options
- wrapper - requires a name to wrap result documents in
This step requires an external XProc engine like Morgana.
More information on p:pack can be found in the XProc standard.
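The pairing rule, including the handling of a longer sequence, can be sketched as follows. This is a Python model with documents as XML strings; pack is a hypothetical helper and the string concatenation stands in for real XML construction.

```python
from itertools import zip_longest
from typing import List, Sequence


def pack(source: Sequence[str], alternate: Sequence[str], wrapper: str) -> List[str]:
    """Wrap the i-th source and i-th alternate document in one wrapper element."""
    packed = []
    # fillvalue="" means a leftover document ends up wrapped by itself
    for left, right in zip_longest(source, alternate, fillvalue=""):
        packed.append(f"<{wrapper}>{left}{right}</{wrapper}>")
    return packed


pairs = pack(["<a/>", "<b/>"], ["<x/>"], "pair")
```

Here the second source document has no partner, so it is the only child of its wrapper.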
p:rename
This step renames elements, attributes, or processing-instruction targets. Each node selected by the match option is renamed to the name specified by the new-name option.
Inputs
- source - accepts: xml html
Outputs
- result - produces: xml html
Options
- new-name - requires a new name for the selected nodes
- match - an XSLT selection pattern (default is '/*')
This step requires an external XProc engine like Morgana.
More information on p:rename can be found in the XProc standard.
p:replace
Replaces nodes selected by the match option with the top-level node(s) of the replacement input port's document.
Inputs
- source - accepts: xml html
- replacement - accepts: text xml html
Outputs
- result - produces: text xml html
Options
- match - requires an XSLT selection pattern
More information on p:replace can be found in the XProc standard.
p:set-attributes
This step sets attributes on all elements selected by the match option. If an attribute of the same name already exists, it will be updated with a new value.
Inputs
- source - accepts: xml html
Outputs
- result - produces: xml html
Options
- attributes - requires a map of attribute names and values to set
- match - an XSLT selection pattern (default is '/*')
This step requires an external XProc engine like Morgana.
More information on p:set-attributes can be found in the XProc standard.
p:set-properties
This step sets XProc document properties on the document passed to it and stores the document on the result output port. The content of the document is not modified.
Inputs
- source - accepts: any
Outputs
- result - produces: any
Options
- properties - requires a map of property names and values to set
- merge - if true merge with existing properties, otherwise replace the entire set (default is true)
This step requires an external XProc engine like Morgana.
More information on p:set-properties can be found in the XProc standard.
p:sink
This step accepts a sequence of documents and discards them. It has no output.
It can be used to stop the default flow of documents from one step to the next, in particular when used with p:choose steps. For example, one p:when step allows content to flow to the next step, while another p:when or p:otherwise step does not.
Inputs
- source - accepts: any
More information on p:sink can be found in the XProc standard.
p:split-sequence
This step accepts a sequence of documents and divides it into two sequences. The test option expression is applied to each document on the source input port. If the result is true the document is copied to the matched output port, otherwise it is copied to the not-matched output port.
If the initial-only option is true, then when the first document that does not satisfy the test expression is encountered, it and all the documents that follow it are written to the not-matched output port.
Inputs
- source - accepts: any
Outputs
- matched - produces: any
- not-matched - produces: any
Options
- test - requires an expression to test each input document with
- initial-only - if true stop testing after the first fail (default is false)
This step requires an external XProc engine like Morgana.
More information on p:split-sequence can be found in the XProc standard.
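The difference that initial-only makes is easiest to see side by side. The following Python sketch models the rule described above; split_sequence is a hypothetical name and the documents are plain integers for brevity.

```python
from typing import Callable, Iterable, List, Tuple


def split_sequence(docs: Iterable[int], test: Callable[[int], bool],
                   initial_only: bool = False) -> Tuple[List[int], List[int]]:
    """Return (matched, not_matched) for the given test."""
    matched: List[int] = []
    not_matched: List[int] = []
    stopped = False  # becomes True after the first failure when initial_only is set
    for doc in docs:
        if not stopped and test(doc):
            matched.append(doc)
        else:
            not_matched.append(doc)
            if initial_only:
                stopped = True  # everything after the first failure is not-matched
    return matched, not_matched


small = lambda d: d < 3
default_split = split_sequence([1, 2, 5, 1], small)
initial_split = split_sequence([1, 2, 5, 1], small, initial_only=True)
```

With the default setting the trailing 1 is still matched; with initial-only it follows the 5 to the not-matched side.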
p:store
Saves the input document to a URI. The URI specified by the href option must reference a local file using the "file:" URI scheme (e.g., "file:///c:/path/to/myfile.txt" on Windows, and "file:///path/to/myfile.txt" on Mac).
The result output port stores a copy of the document passed in (just like p:identity). The result-uri output port stores the location of the stored document in a c:result document.
Inputs
- source - accepts: any
Outputs
- result - produces: any
- result-uri - produces: application/xml
Options
- href - requires a URL pointing to a file location on the host computer or a local area network
- serialization - map of settings that can be used to modify how the result document is saved
More information on p:store can be found in the XProc standard.
p:string-replace
This step replaces all nodes selected by the match option with the replacement value. The replacement value is the string result of evaluating the expression in the replace option, using the matched node as the XPath context node.
If the document node is matched, the entire document is replaced by the string value of the replace expression. What appears on the result output port is a text document containing the replacement value.
Inputs
- source - accepts: xml html
Outputs
- result - produces: text xml html
Options
- match - requires an XSLT selection pattern
- replace - requires an expression to create the replacement value
More information on p:string-replace can be found in the XProc standard.
p:text-count
This step counts the number of lines in a text document and stores a single c:result XML document containing that number on the result output port.
Inputs
- source - accepts: text
Outputs
- result - produces: application/xml
This step requires an external XProc engine like Morgana.
More information on p:text-count can be found in the XProc standard.
p:text-head
Copies lines from the beginning of a text document to a text document on the result output port.
If the count option is positive, copy the first count lines. If it is zero, copy all lines. If it is negative, copy all lines except the first N lines, where N is the absolute value of count.
Inputs
- source - accepts: text
Outputs
- result - produces: text
Options
- count - requires the number of lines to copy or skip
This step requires an external XProc engine like Morgana.
More information on p:text-head can be found in the XProc standard.
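The three cases for count can be checked with a small model. This Python sketch assumes newline-terminated lines; text_head is a hypothetical name, not part of Tag.

```python
def text_head(text: str, count: int) -> str:
    """Model of the count rule: positive keeps, zero copies all, negative skips."""
    lines = text.splitlines(keepends=True)
    if count > 0:
        kept = lines[:count]    # the first `count` lines
    elif count == 0:
        kept = lines            # all lines
    else:
        kept = lines[-count:]   # everything after the first |count| lines
    return "".join(kept)


sample = "a\nb\nc\n"
first_two = text_head(sample, 2)
skip_one = text_head(sample, -1)
```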
p:text-join
This step joins two or more text documents together, into one text document on the result output port. The separator option can be used to insert a string between documents.
If the prefix option is provided, it will start the result document even if no input documents are provided. The suffix option does the same thing to the end of the result.
The override-content-type option can be used to change the content type of the result document to a valid text-based type (e.g., CSV, JSON).
Inputs
- source - accepts: text
Outputs
- result - produces: text
Options
- override-content-type - can change the result document's content type
- prefix - a string to start the result document
- separator - a string to separate all documents
- suffix - a string to end the result document
This step requires an external XProc engine like Morgana.
More information on p:text-join can be found in the XProc standard.
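How prefix, separator and suffix combine, including the case with no input documents, can be sketched in one line of Python. The function name text_join is hypothetical and the documents are plain strings.

```python
from typing import Sequence


def text_join(docs: Sequence[str], separator: str = "",
              prefix: str = "", suffix: str = "") -> str:
    """Join documents with a separator, then add the prefix and suffix."""
    # The separator only appears between documents, never at either end.
    return prefix + separator.join(docs) + suffix


joined = text_join(["a", "b", "c"], separator=",", prefix="[", suffix="]")
empty = text_join([], separator=",", prefix="[", suffix="]")
```

The second call shows that the prefix and suffix still appear when there is nothing to join.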
p:text-replace
This step replaces all occurrences of substrings in a text document with a given replacement string.
The pattern option selects the substrings to replace. It must be a valid XPath regular expression. The flags option may be used to refine how the pattern is interpreted.
Inputs
- source - accepts: text
Outputs
- result - produces: text
Options
- pattern - requires a regular expression used to select substrings
- replacement - requires a string to replace matches with
- flags - can be used to interpret the pattern
This step requires an external XProc engine like Morgana.
More information on p:text-replace can be found in the XProc standard.
p:text-sort
This step sorts lines in a text document, and stores the result on the result output port.
The sort-key option is an expression that is applied to each line - the result is used to sort the lines.
The case-order option defines whether upper-case letters are to be collated before or after lower-case letters.
Inputs
- source - accepts: text
Outputs
- result - produces: text
Options
- case-order - one of "upper-first" or "lower-first" (default is language-dependent)
- collation - how strings are compared with each other (ref: Morgana)
- lang - language whose collating conventions are to be used (default is from Windows or Mac computer)
- order - one of "ascending" or "descending" (default is "ascending")
- sort-key - expression used to sort the lines (default is '.')
- stable - if false, the order of lines that have the same sort key may change (default is true)
This step requires an external XProc engine like Morgana.
More information on p:text-sort can be found in the XProc standard.
p:text-tail
Copies lines from the end of a text document to a text document on the result output port.
If the count option is positive, copy the last count lines. If it is zero, copy all lines. If it is negative, copy all lines except the last N lines, where N is the absolute value of count.
Inputs
- source - accepts: text
Outputs
- result - produces: text
Options
- count - requires the number of lines to copy or skip
This step requires an external XProc engine like Morgana.
More information on p:text-tail can be found in the XProc standard.
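The count rule here mirrors the one for p:text-head, but works from the end of the document. A minimal Python model follows, assuming newline-terminated lines; text_tail is a hypothetical name.

```python
def text_tail(text: str, count: int) -> str:
    """Positive keeps the last lines, zero copies all, negative drops the last lines."""
    lines = text.splitlines(keepends=True)
    if count > 0:
        kept = lines[-count:]   # the last `count` lines
    elif count == 0:
        kept = lines            # all lines
    else:
        kept = lines[:count]    # count is negative, so this drops the last |count| lines
    return "".join(kept)


log = "a\nb\nc\n"
last_two = text_tail(log, 2)
drop_last = text_tail(log, -1)
```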
p:unarchive
This step accepts a single archive (e.g., ZIP) file, and copies zero or more of its entries to the result output port.
The include-filter and exclude-filter options can specify filters to include or exclude entries. If neither exists all entries are copied to the result. If include-filter(s) exist, entries that match them are copied to the result. If exclude-filter(s) exist, entries that match them are not copied to the result. If both exist, the include filter(s) are processed first.
Inputs
- source - accepts: any
Outputs
- result - produces: any
Options
- exclude-filter - sequence of strings to exclude that each must be a valid XPath regular expression
- format - can specify the archive format (e.g., ZIP), if it can't be automatically detected
- include-filter - sequence of strings to include that each must be a valid XPath regular expression
- override-content-types - used to partially override the content-type detection mechanism
- parameters - map of settings to control unarchiving (ref: Morgana)
- relative-to - used to create the base URI of unarchived documents
This step requires an external XProc engine like Morgana.
More information on p:unarchive can be found in the XProc standard.
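The filter rules above can be modelled as "keep an entry if it passes the include filters, then drop it if it matches an exclude filter". The Python sketch below makes two assumptions: select_entries is a hypothetical helper, and Python's re module stands in for XPath regular expressions, whose syntax differs in places.

```python
import re
from typing import List, Sequence


def select_entries(names: Sequence[str],
                   include_filter: Sequence[str] = (),
                   exclude_filter: Sequence[str] = ()) -> List[str]:
    """Return the archive entry names that would be copied to the result."""
    selected = []
    for name in names:
        # Include filters are processed first; with none given, every entry passes.
        if include_filter and not any(re.search(p, name) for p in include_filter):
            continue
        # Any exclude match removes the entry.
        if any(re.search(p, name) for p in exclude_filter):
            continue
        selected.append(name)
    return selected


entries = ["doc/a.xml", "doc/b.xml", "img/c.png"]
xml_only = select_entries(entries, include_filter=[r"\.xml$"])
xml_not_b = select_entries(entries, include_filter=[r"\.xml$"], exclude_filter=[r"/b\."])
```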
p:uncompress
Accepts a compressed document (e.g., GZIP) and stores an uncompressed version on the result output port.
Inputs
- source - accepts: any
Outputs
- result - produces: any
Options
- content-type - can override the detected content type (default is application/octet-stream)
- format - can specify the archive format (e.g., GZIP), if it can't be automatically detected
- parameters - map of settings to control uncompression (ref: Morgana)
This step requires an external XProc engine like Morgana.
More information on p:uncompress can be found in the XProc standard.
p:unwrap
This step replaces matched elements with their children. The match option contains a pattern that must refer to document or element nodes. Every selected node is replaced by its children, effectively "unwrapping" the children from their parent.
This step can cause some unusual effects. For example, if a document node is selected which consists of a root element containing only text, the result is a document node with a single text node. In this case, the result document's content type will become text/plain.
Inputs
- source - accepts: xml html
Outputs
- result - produces: application/xml text/plain
Options
- match - an XSLT selection pattern (default is '/*')
This step requires an external XProc engine like Morgana.
More information on p:unwrap can be found in the XProc standard.
p:uuid
This step generates a UUID (universally unique identifier) and injects it into the source document. The version option may be used to select a specific version of the UUID algorithm. An example of a version 4 (default) UUID is "b98a4da4-9c90-48a7-9508-b25546d0a0f1".
The match option contains a pattern that selects nodes to replace with the UUID. If more than one node matches, the same UUID is used to replace each one. If the document node is selected, the result is a text document that only contains the UUID.
Inputs
- source - accepts: xml html
Outputs
- result - produces: text xml html
Options
- match - an XSLT selection pattern (default is '/*')
- version - UUID algorithm version (default is version 4 UUID)
This step requires an external XProc engine like Morgana.
More information on p:uuid can be found in the XProc standard.
p:wrap
This step wraps matching nodes in the source document with a new parent element. The match option selects nodes to wrap. The wrapper option provides the name of a new parent element.
Each matched node is replaced by an element named by the wrapper option, which contains a copy of the replaced node and all its descendants.
The group-adjacent option can be used to gather adjacent nodes to be wrapped. It contains an expression that is evaluated for each matching node, which must return sibling nodes (i.e., no nodes between) that will reside together in the new parent.
Inputs
- source - accepts: xml html
Outputs
- result - produces: application/xml
Options
- match - requires an XSLT selection pattern
- wrapper - requires a name for the new wrapper element
- group-adjacent - expression used to gather sibling nodes that will also reside in the new parent
This step requires an external XProc engine like Morgana.
More information on p:wrap can be found in the XProc standard.
p:wrap-sequence
This step accepts a sequence of documents and produces a single document, or optionally a new sequence of documents. The wrapper option provides the name of all document elements copied to the result output port.
Usually, all source documents end up in one result document. If the group-adjacent option is used, multiple documents could be created to group sequentially adjacent documents. It contains an expression that is evaluated for each source document, and will group sibling documents that have the same expression result.
Inputs
- source - accepts: text xml html
Outputs
- result - produces: application/xml
Options
- wrapper - requires a name for the new wrapper element
- group-adjacent - expression used to gather sibling documents that will also reside in the new parent
More information on p:wrap-sequence can be found in the XProc standard.
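Grouping by group-adjacent behaves like a run-length grouping: consecutive documents with the same expression result share a wrapper. The Python sketch below models this with itertools.groupby; wrap_sequence is a hypothetical name, documents are XML strings, and a plain function stands in for the XPath expression.

```python
from itertools import groupby
from typing import Callable, List, Optional, Sequence


def wrap_sequence(docs: Sequence[str], wrapper: str,
                  group_adjacent: Optional[Callable[[str], object]] = None) -> List[str]:
    """Wrap all documents in one element, or one element per adjacent group."""
    if group_adjacent is None:
        groups = [list(docs)]  # usual case: everything in one result document
    else:
        # groupby only merges neighbours, so equal keys further apart stay separate
        groups = [list(g) for _, g in groupby(docs, key=group_adjacent)]
    return [f"<{wrapper}>{''.join(group)}</{wrapper}>" for group in groups]


docs = ["<a/>", "<a/>", "<b/>", "<a/>"]
single = wrap_sequence(docs, "w")
grouped = wrap_sequence(docs, "w", group_adjacent=lambda d: d)
```

Note that the final `<a/>` gets its own wrapper because it is not adjacent to the first two.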
p:www-form-urldecode
This step decodes an x-www-form-urlencoded string into a JSON representation. A JSON map will appear on the result output port.
It does not have an input port - the value option must contain a string of parameter values encoded using the x-www-form-urlencoded algorithm. Each name/value pair is copied to the JSON map as a key/value entry.
Outputs
- result - produces: application/json
Options
- value - requires an x-www-form-urlencoded value
This step requires an external XProc engine like Morgana.
More information on p:www-form-urldecode can be found in the XProc standard.
p:www-form-urlencode
This step encodes a set of parameter values as an x-www-form-urlencoded string. It does not have an input port - the parameters option contains a map of key/value pairs that are encoded.
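As a rough illustration of the encoding (using Python's standard library rather than the step, with made-up parameter values), spaces become "+" and reserved characters are percent-encoded:

```python
from urllib.parse import urlencode

# Hypothetical key/value pairs, standing in for the parameters map.
params = {"q": "xproc steps", "lang": "en", "note": "50%"}
encoded = urlencode(params)
print(encoded)  # q=xproc+steps&lang=en&note=50%25
```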
Outputs
- result - produces: text/plain
Options
- parameters - requires a map of key/value pairs to be encoded
This step requires an external XProc engine like Morgana.
More information on p:www-form-urlencode can be found in the XProc standard.
p:xinclude
This step invokes an XInclude processor using the source document, which is assumed to contain XInclude instructions. The result is copied to the result output port.
Inputs
- source - accepts: xml html
Outputs
- result - produces: xml html
Options
- fixup-xml-base - if true, base URI fixup will be performed
- fixup-xml-lang - if true, language fixup will be performed
This step requires an external XProc engine like Morgana.
More information on p:xinclude can be found in the XProc standard.
p:xquery
This step invokes an XQuery processor on a sequence of source documents, using XQuery instructions provided on the query input port. The resulting documents are copied to the result output port.
Inputs
- source - accepts: any
- query - accepts: text xml
Outputs
- result - produces: any
Options
- parameters - used to set the query’s external variables
- version - may specify the version of XQuery used (default is 3.1)
More information on p:xquery can be found in the XProc standard.
p:xslt
This step invokes an XSLT processor on a sequence of source documents, using XSLT instructions provided on the stylesheet input port. The resulting documents are copied to the result output port.
The populate-default-collection option is used to control whether all source documents form the default collection for the XSLT transformation.
The secondary output port may contain secondary results from the transformation, which are defined in the XSLT stylesheet using the xsl:result-document instruction.
Inputs
- source - accepts: any
- stylesheet - accepts: xml
Outputs
- result - produces: any
- secondary - produces: any
Options
- global-context-item - used as global context item
- initial-mode - the initial mode for the invocation
- output-base-uri - sets the base output URI
- parameters - used to define top-level stylesheet parameters
- populate-default-collection - controls whether all source documents form the default collection for the transformation
- static-parameters - used to define static parameters
- template-name - initial template to invoke
- version - may specify the version of XSLT used (default is 3.0)
More information on p:xslt can be found in the XProc standard.
p:directory-list
This step lists the contents of a directory/folder on your computer or local network. The path option selects the directory using a URI (e.g., "file:///c:/path/to/myfolder/" on Windows, and "file:///path/to/myfolder/" on Mac). If a relative URI is provided, it is resolved against the directory containing the pipeline.
The result output port will contain a c:directory document, which contains c:file, c:directory and c:other entries like the following:
<c:file xml:base="file1.txt" name="file1.txt" />
The max-depth option may contain either the string "unbounded" or an integer. An integer value of 0 means that only information about the specified directory is returned. A value of 1 (the default) also returns information about the selected directory's immediate children. Larger values recurse deeper into subfolders.
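The effect of max-depth can be sketched in Python. This only mimics the depth rule described above against a temporary folder; the function name and the nested tuple result are hypothetical, and the real step returns a c:directory document instead.

```python
import os
import tempfile

def list_directory(path, max_depth=1):
    """Return nested (name, children) data; children is None for files."""
    def scan(folder, depth):
        # depth counts how many levels below the selected directory we are.
        if max_depth != "unbounded" and depth >= max_depth:
            return []
        entries = []
        for entry in sorted(os.scandir(folder), key=lambda e: e.name):
            if entry.is_dir():
                entries.append((entry.name, scan(entry.path, depth + 1)))
            else:
                entries.append((entry.name, None))
        return entries
    return scan(path, 0)

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub", "deep"))
    open(os.path.join(root, "file1.txt"), "w").close()
    open(os.path.join(root, "sub", "file2.txt"), "w").close()
    depth0 = list_directory(root, 0)            # only the directory itself
    depth1 = list_directory(root, 1)            # immediate children
    depth2 = list_directory(root, 2)            # children and grandchildren
    unbounded = list_directory(root, "unbounded")

print(depth1)  # [('file1.txt', None), ('sub', [])]
```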
The include-filter and exclude-filter options can specify filters to include or exclude entries. If neither exists, all entries are added to the result. If include filters exist, only entries that match them are added to the result. If exclude filters exist, entries that match them are not added to the result. If both exist, the include filters are processed first.
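The filter order can be sketched as follows. This is an illustration only: it uses Python regular expressions (which differ from XPath regular expressions in some details) and assumes the filters are tested against entry names; keep_entry is a hypothetical helper.

```python
import re

def keep_entry(name, include_filters=(), exclude_filters=()):
    # Include filters are applied first: if any are present, an entry must match one.
    if include_filters and not any(re.search(p, name) for p in include_filters):
        return False
    # Exclude filters then remove entries that survived the include pass.
    return not any(re.search(p, name) for p in exclude_filters)

names = ["report.xml", "report.bak.xml", "notes.txt", "image.png"]
kept = [n for n in names if keep_entry(n, [r"\.xml$", r"\.txt$"], [r"\.bak\."])]
print(kept)  # ['report.xml', 'notes.txt']
```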
Outputs
- result - produces: application/xml
Options
- path - requires a URI selecting the directory to scan
- detailed - if true, the result will contain more file/folder details (e.g., size, last-modified) - default is false, which means only name and xml:base attributes are included
- exclude-filter - sequence of strings, each a valid XPath regular expression; matching entries are excluded
- include-filter - sequence of strings, each a valid XPath regular expression; matching entries are included
- max-depth - "unbounded", or an integer indicating the maximum subfolder depth to recurse into (default is 1)
- override-content-types - used to partially override the content-type mechanism
This step requires an external XProc engine like Morgana.
More information on p:directory-list can be found in the XProc standard.
p:file-copy
This step copies a file or directory to a target location. The href option selects the file or folder to copy, and the target option selects the destination.
If the target is a non-existing folder, it will be created before copying begins. For file copying, if the target is a file, that file name will be used, otherwise the original file name will be retained. A similar approach is used for folders.
If the overwrite option is false, no existing file will be replaced.
If the copy is successful, a c:result document will be written to the result output port containing the absolute URI of the target. If an error occurs (e.g., no permission to save files) and the fail-on-error option is false, the step returns a c:error document; otherwise an error is raised.
Outputs
- result - produces: application/xml
Options
- href - requires a URI selecting the source file/folder
- target - requires a URI selecting the destination to copy into
- fail-on-error - if false, return an error document instead of raising an error (default is true)
- overwrite - if false, prevent existing files from being replaced (default is true)
This step requires an external XProc engine like Morgana.
More information on p:file-copy can be found in the XProc standard.
p:file-create-tempfile
This step creates a temporary file. The temporary file is guaranteed not to already exist when the step is called.
If the href option contains the URI of an existing directory, the temp file will be created there. Otherwise, a Windows or Mac system folder will be used.
If the prefix option is provided, the temp file name will start with it. If the suffix option is provided, the temp file name will end with it.
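Python's standard library has a close analogue that shows how the directory, prefix and suffix combine. The values below are made up, and the step's actual naming scheme for the middle part of the file name is not specified here.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # mkstemp creates a new, uniquely named file and returns (fd, path).
    fd, temp_path = tempfile.mkstemp(prefix="report-", suffix=".xml", dir=folder)
    os.close(fd)
    temp_name = os.path.basename(temp_path)
    existed = os.path.exists(temp_path)
    print(temp_name)  # starts with "report-" and ends with ".xml"
```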
If the temporary file is created successfully, a c:result document containing the absolute URI of this file is written to the result output port.
Outputs
- result - produces: application/xml
Options
- delete-on-exit - if true, attempt to delete temp file when the pipeline finishes running (default is false)
- fail-on-error - if false, return an error document instead of raising an error (default is true)
- href - URI to a directory where the temp file should be created
- prefix - start of the temp file name
- suffix - end of the temp file name
This step requires an external XProc engine like Morgana.
More information on p:file-create-tempfile can be found in the XProc standard.
p:file-delete
This step deletes a file or a directory identified by the href option. If a directory is selected, the recursive option must be true or the directory must be empty.
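The recursive rule can be sketched in Python. This mirrors the behaviour described above using a temporary folder and is not the step's implementation; delete_path is a hypothetical helper.

```python
import os
import shutil
import tempfile

def delete_path(path, recursive=False):
    if os.path.isdir(path):
        if recursive:
            shutil.rmtree(path)   # removes the folder and everything in it
        else:
            os.rmdir(path)        # raises OSError if the folder is not empty
    else:
        os.remove(path)

with tempfile.TemporaryDirectory() as root:
    folder = os.path.join(root, "out")
    os.mkdir(folder)
    open(os.path.join(folder, "a.txt"), "w").close()
    try:
        delete_path(folder)       # not empty and recursive is false
        refused = False
    except OSError:
        refused = True
    delete_path(folder, recursive=True)
    removed = not os.path.exists(folder)

print(refused, removed)  # True True
```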
If successful, a c:result document containing the absolute URI of the file or directory deleted will be written to the result output port.
Outputs
- result - produces: application/xml
Options
- href - requires a URI to the file or folder being deleted
- fail-on-error - if false, return an error document instead of raising an error (default is true)
- recursive - if true and a folder is selected, also delete all child files and folders (default is false)
This step requires an external XProc engine like Morgana.
More information on p:file-delete can be found in the XProc standard.
p:file-info
This step returns information about a file, directory or other file system object identified by the href option.
If a file is identified, a c:file document is written to the result output port. It includes at least the following attributes: name, readable, writable, hidden, last-modified, size and content-type.
If a folder is identified, a c:directory document is written to the result output port. It includes the same attributes as above.
If something other than a file or folder is identified, a c:other document is written to the result output port. It includes a name attribute.
Outputs
- result - produces: application/xml
Options
- href - requires a URI to the item being queried
- fail-on-error - if false, return an error document instead of raising an error (default is true)
- override-content-types - used to partially override the content-type mechanism
This step requires an external XProc engine like Morgana.
More information on p:file-info can be found in the XProc standard.
p:file-mkdir
This step creates a directory identified by the href option. If this command involves missing parent directories, they will be created automatically.
If successful, a c:result document is written to the result output port containing the absolute URI of the directory. If the directory already exists, nothing is done but the c:result document is still created.
Outputs
- result - produces: application/xml
Options
- href - requires a URI to the directory being created
- fail-on-error - if false, return an error document instead of raising an error (default is true)
This step requires an external XProc engine like Morgana.
More information on p:file-mkdir can be found in the XProc standard.
p:file-move
This step moves a file or directory identified by the href option, to a location identified by the target option.
If the target option specifies an existing directory, the step attempts to move a file or directory into that directory.
If the move is successful, a c:result document is written to the result output port containing the absolute URI of the target.
Outputs
- result - produces: application/xml
Options
- href - requires a URI to the file or directory being moved
- target - requires a URI to the directory into which items will be moved
- fail-on-error - if false, return an error document instead of raising an error (default is true)
This step requires an external XProc engine like Morgana.
More information on p:file-move can be found in the XProc standard.
p:file-touch
This step updates the modification timestamp of a file identified by the href option. If the specified file does not exist, an empty file will be created at that location.
If the timestamp option is set, the file's timestamp is set to this value. Otherwise, the file's timestamp is set to the current system's date and time.
Outputs
- result - produces: application/xml
Options
- href - requires a URI to the file being updated
- fail-on-error - if false, return an error document instead of raising an error (default is true)
- timestamp - timestamp to be used
This step requires an external XProc engine like Morgana.
More information on p:file-touch can be found in the XProc standard.
p:validate-with-relax-ng
This step validates XML or HTML source content using Relax NG (RNG) instructions, provided on the schema input port. This is the same format used in Tag for content generation. In the Scribe app *.rng files are called data setup files.
Errors and warnings are written to the report output port. If successful, the source document is copied to the result output port, possibly augmented by DTD compatibility or PSVI annotations.
Inputs
- source - accepts: xml html
- schema - accepts: text xml
Outputs
- result - produces: xml html
- report - produces: xml json
Options
- assert-valid - if true, raise an error if the input is not valid (default is true)
- dtd-attribute-values - if true, apply DTD compatibility conventions (default is false)
- dtd-id-idref-warnings - if true, report DTD compatibility errors (default is false)
- parameters - optional parameters (ref: Morgana)
- report-format - specify report format (default is 'xvrl')
This step requires an external XProc engine like Morgana.
More information on p:validate-with-relax-ng can be found in the XProc standard.
p:validate-with-schematron
This step validates XML or HTML source content using Schematron instructions, provided on the schema input port.
Errors and warnings are written to the report output port. If successful, the source document is copied to the result output port, possibly augmented by PSVI annotations.
Inputs
- source - accepts: xml html
- schema - accepts: xml
Outputs
- result - produces: xml html
- report - produces: xml json
Options
- assert-valid - if true, raise an error if the input is not valid (default is true)
- parameters - map containing Schematron external variables
- phase - starting Schematron validation phase
- report-format - specify report format (default is 'svrl')
This step requires an external XProc engine like Morgana.
More information on p:validate-with-schematron can be found in the XProc standard.
p:validate-with-xml-schema
This step validates XML or HTML source content using XML Schema instructions, provided on the schema input port.
Errors and warnings are written to the report output port. If successful, the source document is copied to the result output port, possibly augmented by PSVI annotations.
Inputs
- source - accepts: xml html
- schema - accepts: xml
Outputs
- result - produces: xml html
- report - produces: xml json
Options
- assert-valid - if true, raise an error if the input is not valid (default is true)
- mode - one of "strict" or "lax" (default is "strict")
- parameters - map containing external parameters
- report-format - specify report format (default is 'xvrl')
- try-namespaces - if true, attempt to dereference namespace URIs to locate schema documents (default is false)
- use-location-hints - if true, use schema location hints (default is false)
- version - version of XML Schema to be used
This step requires an external XProc engine like Morgana.
More information on p:validate-with-xml-schema can be found in the XProc standard.
tag:connector
This step uses a pre-defined Tag connector to call a web API. The connector must be loaded in the Connect app and is referenced using the ref option. Connectors can be imported and exported in the "Manage preferences" panel (top-right Account menu).
Connectors store all information needed to make a web API call, including the URL, headers and user authentication information. When an API key is required to authenticate web API users, Tag can securely save API keys using preferences and access them via this step.
When content must be uploaded to the web API as part of a call (e.g., for HTTP POST requests), the connector must store the post body to upload. When the tag:connector step is used in a pipeline, the p:insert step can be used to update the post body before the call is made.
The output of this step depends on the web API called. The most common formats are JSON, XML and text. The response received from the web API is copied to the result output port as-is.
Outputs
- result - produces: any
Options
- ref - requires the name of a connector to run
This extension step only works in the Tag XProc engine.
More information on tag:connector can be found in the nSymbol step documentation.
tag:csv
This step converts a CSV (comma-separated values) document into an XML document. A simple XML structure is created, made up of multiple <r> elements that each contain one child for every column.
CSV headers are read from the first row unless the read-headers option is false. Headers are used to name <r> child elements - if not available, <v> elements are used.
The namespace option may be used to define a namespace in the result XML.
A future version of Tag may extend this step to handle XML to CSV conversion.
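The row and header mapping described above can be sketched in Python. This is an approximation, not Tag's implementation: the root element name ("rows") is hypothetical, headers are assumed to be valid XML names, and the namespace option is left out.

```python
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_xml(text, read_headers=True, root_name="rows"):
    rows = list(csv.reader(io.StringIO(text)))
    headers = rows.pop(0) if read_headers and rows else None
    root = ET.Element(root_name)  # root element name is hypothetical
    for row in rows:
        r = ET.SubElement(root, "r")
        for index, value in enumerate(row):
            # Use the header as the element name when available, else <v>.
            name = headers[index] if headers and index < len(headers) else "v"
            ET.SubElement(r, name).text = value
    return ET.tostring(root, encoding="unicode")

with_headers = csv_to_xml("name,city\nAda,London\n")
without_headers = csv_to_xml("Ada,London\n", read_headers=False)
print(with_headers)     # <rows><r><name>Ada</name><city>London</city></r></rows>
print(without_headers)  # <rows><r><v>Ada</v><v>London</v></r></rows>
```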
Inputs
- source - accepts: text
Outputs
- result - produces: xml
Options
- namespace - a namespace URI for the result XML
- read-headers - if true, treat values in the first row as headers (default is true)
This extension step only works in the Tag XProc engine.
More information on tag:csv can be found in the nSymbol step documentation.
tag:docx
This step converts an XSL-FO document (the default rich text format in Tag) into a DOCX document (*.docx file) that can be opened in one of several popular word processors.
The output of this step is considered binary from a pipeline perspective. Typically, a p:store step is used to save it to a file.
Only a subset of format settings are converted, roughly corresponding to the available format tools in the Tag rich text editor.
A future version of Tag may extend this step to handle DOCX to XSL-FO conversion.
Inputs
- source - accepts: xml
Outputs
- result - produces: any
This extension step only works in the Tag XProc engine.
More information on tag:docx can be found in the nSymbol step documentation.
tag:google
This step allows you to call Google APIs if you have a Google business account. Google has a vast selection of APIs available to access Google resources like Drive, Docs, Sheets, Email and much more.
At a minimum, you need to provide the href and scope options for each API call. These are defined by Google documentation. API calls must be enabled in your Google Cloud account (see link to Tag docs below for more info).
When calling a Google API for the first time, a login challenge is made. You must be logged in to your Google account in a web browser. Tag will detect this, and open a web page that allows you to authorize the scope(s) required for that web API call (this is the same OAuth 2.0 permission granting mechanism used in mobile apps).
This permission can be reused many times, until it eventually expires and displays the permission form to you again. Importantly, it can be reused by other API calls that require the same scope.
The user option is normally not needed. It may be useful if you are calling multiple APIs with differing scopes. It is used to cache permissions on your computer.
The response from the API call is stored on the result output port. The report output port is used to store a JSON report if one is returned by the Google API.
Inputs
- source - accepts: any
Outputs
- result - produces: any
- report - produces: json
Options
- href - requires a Google API URI to call
- scope - requires a space-separated list of scope identifiers
- method - HTTP request method which can be "GET", "POST", "PUT", "PATCH" or "DELETE" (default is "GET")
- parameters - map of parameters expected by API
- user - optional user name used during login
This extension step only works in the Tag XProc engine.
More information on tag:google can be found in the nSymbol step documentation.
tag:html
This step converts an XSL-FO document (the default rich text format in Tag) into an HTML document (website page) that can be opened in any web browser.
The save-as-xhtml option allows you to save the result as an XHTML document, which is a form of pure XML. While Tag tries to treat HTML and XHTML in a consistent way, there may be situations (in particular with other software programs) where using XHTML provides an advantage.
The output of this step is HTML or XML, which can both be processed further by other pipeline steps. A p:store step can be used to save it to a file.
Only a subset of format settings are converted, roughly corresponding to the available format tools in the Tag rich text editor.
A future version of Tag may extend this step to handle HTML to XSL-FO conversion.
Inputs
- source - accepts: xml
Outputs
- result - produces: xml html
Options
- save-as-xhtml - if true, save the result as XHTML instead of HTML (default is false)
This extension step only works in the Tag XProc engine.
More information on tag:html can be found in the nSymbol step documentation.
tag:json-as-xml
This step converts a JSON document into an XML document.
There are two ways to perform this conversion, and the method option controls which one is used. The xpath method is the conversion method used by the XPath json-to-xml() function. It creates accurate, yet verbose, XML to represent the input JSON.
The other conversion method is jackson, which refers to the popular Jackson open source library. The XML created by Jackson is less verbose and may be more suitable for some purposes. This is the default method for this step. In some cases, this method will not be possible (due to complexity of the input JSON) and the xpath method will need to be used.
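To give a feel for why the xpath method is verbose, the following Python sketch builds XML in the general shape produced by json-to-xml(): every value becomes a map, array, string, number, boolean or null element in the "http://www.w3.org/2005/xpath-functions" namespace, with map keys held in a key attribute. It is an approximation for illustration (number formatting and escaping details are simplified) and does not show the Jackson output.

```python
import json
import xml.etree.ElementTree as ET

FN = "http://www.w3.org/2005/xpath-functions"

def json_to_xml(value, key=None):
    # bool must be tested before int/float because bool is a subclass of int.
    if isinstance(value, dict):
        node = ET.Element(f"{{{FN}}}map")
        for k, v in value.items():
            node.append(json_to_xml(v, k))
    elif isinstance(value, list):
        node = ET.Element(f"{{{FN}}}array")
        for item in value:
            node.append(json_to_xml(item))
    elif isinstance(value, bool):
        node = ET.Element(f"{{{FN}}}boolean")
        node.text = "true" if value else "false"
    elif isinstance(value, (int, float)):
        node = ET.Element(f"{{{FN}}}number")
        node.text = str(value)
    elif value is None:
        node = ET.Element(f"{{{FN}}}null")
    else:
        node = ET.Element(f"{{{FN}}}string")
        node.text = str(value)
    if key is not None:
        node.set("key", key)  # map entries carry their key as an attribute
    return node

ET.register_namespace("", FN)  # serialize with a default namespace
tree = json_to_xml(json.loads('{"name": "Ada", "tags": ["x", 1], "active": true}'))
print(ET.tostring(tree, encoding="unicode"))
```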
Inputs
- source - accepts: json
Outputs
- result - produces: xml
Options
- method - method to convert JSON to XML which must be "jackson" or "xpath" (default is "jackson")
This extension step only works in the Tag XProc engine.
More information on tag:json-as-xml can be found in the nSymbol step documentation.
tag:prompter
This step pauses execution of a pipeline to prompt the user for input.
The type option dictates what kind of prompter appears:
- confirm - displays a message with OK and Cancel buttons (returns "yes" or "no")
- info - displays an information message (returns an empty string)
- prompt - displays a message and prompts with a text box (returns a non-empty string or null)
- yes-no-cancel - displays a message with Yes, No and Cancel buttons (returns "yes", "no" or null)
If null is returned, the pipeline will stop running. All other values are wrapped in a c:result document and written to the result output port.
Outputs
- result - produces: xml
Options
- message - requires a message for the user
- prompt - initial value for the prompt type
- title - title for the prompter dialog
- type - type of prompter which must be "confirm", "info", "prompt" or "yes-no-cancel" (default is "prompt")
This extension step only works in the Tag XProc engine.
More information on tag:prompter can be found in the nSymbol step documentation.
tag:sleep
This step pauses execution of the pipeline for a specific duration of time. It can be used to simulate longer-running steps for demos, or during prototype development.
Options
- millis - the number of milliseconds to sleep (default is 500)
This extension step only works in the Tag XProc engine.
More information on tag:sleep can be found in the nSymbol step documentation.
tag:sparql
This step reads remote SPARQL endpoints (semantic databases). A text document containing a SPARQL query is passed in, and used to query a SPARQL endpoint using the server URI and some additional settings.
Note that the query can be generated using logic and/or data by prior steps in the pipeline. This is a very powerful way to access SPARQL endpoints.
The result is saved as XML in a similar way to tag:sql. Each row in the result set creates a repeating element, which has child elements for all returned variables. There is no guarantee that all repeating elements have exactly the same child elements.
Inputs
- source - accepts: text
Outputs
- result - produces: xml
Options
- server - requires the endpoint's URI
- password - password if needed
- port - port number if needed
- user - user name if needed
This extension step only works in the Tag XProc engine.
More information on tag:sparql can be found in the nSymbol step documentation.
tag:sql
This step reads local or remote SQL databases. A text document containing a SQL query is passed in, and used to query a SQL database using the type option ("access", "mysql" or "sql-server"), the server option URI, and some additional settings.
Note that the query can be generated using logic and/or data by prior steps in the pipeline. This is a very powerful way to access SQL databases.
The result is saved as XML where each row in the result set creates a repeating element, which has child elements for all result columns. All repeating elements have the same child elements, although some may be empty.
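The row-to-element mapping can be sketched in Python. SQLite is used here only as a convenient stand-in (it is not one of the database types listed for this step), and the root and row element names are hypothetical; the point is that every row gets the same child elements, with empty elements for missing values.

```python
import sqlite3
import xml.etree.ElementTree as ET

def rows_to_xml(cursor, root_name="result", row_name="row"):
    columns = [d[0] for d in cursor.description]
    root = ET.Element(root_name)  # element names here are hypothetical
    for record in cursor:
        row = ET.SubElement(root, row_name)
        for column, value in zip(columns, record):
            child = ET.SubElement(row, column)
            # NULL columns still get an (empty) element, so every row
            # has the same children.
            child.text = None if value is None else str(value)
    return root

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, city TEXT)")
conn.executemany("INSERT INTO staff VALUES (?, ?)",
                 [("Ada", "London"), ("Lin", None)])
result = rows_to_xml(conn.execute("SELECT name, city FROM staff ORDER BY name"))
print(ET.tostring(result, encoding="unicode"))
conn.close()
```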
Inputs
- source - accepts: text
Outputs
- result - produces: xml
Options
- server - requires the server URI
- type - requires type of SQL database which must be "access", "mysql" or "sql-server"
- database - database name if needed
- password - password if needed
- port - port number if needed
- user - user name if needed
This extension step only works in the Tag XProc engine.
More information on tag:sql can be found in the nSymbol step documentation.
tag:xml-as-json
This step converts an XML document into a JSON document.
There are two ways to perform this conversion, and the input XML determines which one is used. If the XML references the "http://www.w3.org/2005/xpath-functions" namespace, it is converted to JSON exactly like the XPath xml-to-json() function.
If that namespace is not present, the Jackson open source library is used to perform the conversion. If Jackson is unable to perform the conversion, an error is reported and the pipeline will stop.
The save-as-array option may be used during Jackson conversion. Jackson can't handle multiple map siblings with the same name, so some data would not be preserved. To work around this, the option stores an expression that will "flatten" the XML structure into something that converts to an array (e.g., the expression selects a list of repeating elements from somewhere within the XML hierarchy).
Inputs
- source - accepts: xml
Outputs
- result - produces: json
Options
- save-as-array - an expression to select repeating elements for a Jackson conversion
This extension step only works in the Tag XProc engine.
More information on tag:xml-as-json can be found in the nSymbol step documentation.