# @datafire/clarify

DataFire integration and client library for api.clarify.io.

## Installation and Usage
```bash
npm install --save @datafire/clarify
```
```js
let clarify = require('@datafire/clarify').create();

clarify.v1.bundles.get({}).then(data => {
  console.log(data);
});
```

## Description
The API to Search and Understand Audio & Video Data.

## Actions
### clarify.v1.bundles.get
Gets the list of bundles. Links to each item are in the _links with link relation items.
After getting the initial list, use the first, last, next, prev link relations to get more bundles in the list. Note that next will not be available at the end of the list and prev will not be available at the start of the list. If the results are exactly one page neither prev nor next will be available.
The embed parameter specifies link relations to embed in the results. The models for the specified link relations will be in an array in the embedded object with the link relation as the key. For example, if you do embed=items, _embedded will contain a property items whose value is the array of bundle models. For link relations that are curies (ex. "clarify:metadata"), you may simply use the base name (ex. "metadata").
```js
clarify.v1.bundles.get({}, context)
```

#### Input
* input `object`
  * limit `integer`: limit results to specified number of bundles. Default is 10. Max 100.
  * embed `string`: list of link relations to embed in the result collection. Zero or more of: items, tracks, metadata, insights. List is space or comma separated single string or an array of strings
  * iterator `string`: optional opaque value, automatically provided in next/prev links, or literal "first", "last"

#### Output
* output Collection
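
The paging rules above can be sketched as a small helper that turns one page of results into the input for the next call. This is a minimal sketch, not part of the library: `nextPageInput` is a hypothetical name, and it assumes the `next` link is exposed as `_links.next.href` and carries the opaque `iterator` value as a query parameter, which this README does not confirm.

```js
// Hypothetical helper: derive the input for the next page from a Collection response.
// Assumption: page._links.next.href contains an `iterator` query parameter.
function nextPageInput(page, previousInput = {}) {
  const next = page && page._links && page._links.next;
  if (!next || !next.href) return null; // no `next` link: end of the list
  // The base URL is only needed so that relative hrefs can be parsed.
  const iterator = new URL(next.href, 'https://api.clarify.io').searchParams.get('iterator');
  if (!iterator) return null;
  // Keep limit/embed from the previous call and swap in the new iterator.
  return Object.assign({}, previousInput, { iterator });
}
```

A caller would pass the returned object to `clarify.v1.bundles.get` again and stop once the helper returns `null`.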
### clarify.v1.bundles.post
Create a new bundle with the specified name, media url, and optional JSON metadata.
name can be any string you wish to associate with the bundle.
media_url must be a publicly accessible url to a media file. It will be fetched asynchronously after the REST call returns. The audio can be mono or stereo.
audio_channel is used to specify audio channels if the media is a stereo file. A value of left or right signifies that only the specified channel will be used. If no value or an empty string is specified for audio_channel, all channels will be used in a single track. If your stereo channels were recorded separately with each channel containing distinct content (for example if 2 legs of a phone call were recorded separately and combined into a single stereo file), for best speech recognition, create two tracks, with audio_channel set to left and right in each track respectively. If your stereo file is simply a recording made with a stereo microphone, audio_channel should be set to an empty string (or not be specified.) If you have audio channels as separate media files, after creating the bundle with one media_url, POST another media_url to /bundles/{bundle_id}/tracks.
audio_language can be used to specify the language of the audio media. This is an optional parameter and if not specified or an empty string, the language of the track will be automatically detected. If specified, it must be a language code as described in RFC5646 (see http://tools.ietf.org/html/rfc5646). Supported languages: en-US, en-UK, es, fr.
label is a short name for the track.
metadata is a single-level JSON object of your own definition, containing key-values that can be searched and filtered on. Metadata can be used to hold text such as names, titles, descriptions and values for segregating bundles, for example by user, topic, folder name etc. The keys (property names) can be up to 64 characters and must contain only alphanumeric characters and underscore (but not start with underscore) and must not be a reserved name. Reserved names are "true", "false", and "null". Values can be strings, numbers, boolean true/false, date-times represented as a string in ISO 8601 format (ex. "2014-02-25T14:23:45.000Z"), or an array of these primitive types. Strings can be up to 2000 characters and strings in arrays can be up to 128 characters each. Nested objects are not allowed. Metadata can contain up to 50 key-value pairs up to a total JSON size of 4000 characters.
start_time is the time in seconds at which the media starts, relative to the start time of the bundle. This allows you to specify sequential parts of media. If not specified, the default is 0.
parts_pending is a boolean flag specifying whether more media parts will subsequently be added to the track. If true, a subsequent API call must be made to signify that the track is complete. If not specified, the default is false.
external_id is an optional parameter that can be used to logically link a bundle to an item in an external system. The external_id can be whatever you use to identify items in your own database.
notify_url is a webhook. It must be a publicly accessible url (http or https) on your server to which notifications for the bundle will be POSTed. There are three types of notifications: Track Notifications, Insight Notifications and Bundle Notifications. For more information on the content of notifications and when they are sent, see the notification docs page.
If a track was created along with the bundle, the link relation clarify:track will be included with a link to the new track.
```js
clarify.v1.bundles.post({}, context)
```

#### Input
* input `object`
  * name `string`: Name of the bundle. Up to 128 characters.
  * media_url `string`: URL of a media (audio or video) file for this bundle. Up to 2083 characters.
  * audio_channel `string` (values: left, right): The audio channel to use for the track ( "" | left | right ). Default is empty string which means all channels of audio in the media file are used for the track.
  * audio_language `string` (values: en-US, en-UK, es, fr): Language of the audio in the track, specified with an RFC5646 code.
  * start_time `number`: Time offset in seconds that the media starts relative to the bundle. Default is 0.
  * parts_pending `boolean`: Set to true if more media parts will be added to the track. Default is false.
  * label `string`: Label for the track (if media_url is specified.) Up to 128 characters.
  * metadata `string`: User-defined JSON data associated with the bundle. Must be valid JSON, up to 4000 characters.
  * notify_url `string`: URL for notifications on this bundle. Up to 2083 characters.
  * external_id `string`: A string that can refer to an item in an external system. Up to 128 characters.

#### Output
* output Ref (of Bundle)
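
For the two-legged phone call case described above, the two inputs could be prepared roughly as follows. This is only a sketch: `stereoCallInputs` and the `left`/`right` labels are illustrative names, and how the new bundle id is read from the returned Ref is left to the caller because this README does not show that model.

```js
// Hypothetical helper: inputs for a call recorded as one stereo file,
// with each leg of the call on its own channel.
function stereoCallInputs(name, mediaUrl) {
  return {
    // Input for clarify.v1.bundles.post: creates the bundle and its first track.
    bundle: { name, media_url: mediaUrl, audio_channel: 'left', label: 'left' },
    // Input for clarify.v1.bundles.bundle_id.tracks.post, once the bundle id is known.
    track: bundleId => ({
      bundle_id: bundleId,
      media_url: mediaUrl,
      audio_channel: 'right',
      label: 'right',
    }),
  };
}
```

The first object goes to `clarify.v1.bundles.post`; the second call adds the other channel as a separate track of the same bundle.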
### clarify.v1.bundles.bundle_id.delete
Delete a bundle and its related metadata and tracks. This will only delete media stored on Clarify systems and not delete the source media on remote systems.
A successful response will be an HTTP code 204 with an empty body.
```js
clarify.v1.bundles.bundle_id.delete({
  "bundle_id": ""
}, context)
```

#### Input
* input `object`
  * bundle_id required `string`: id of a bundle

#### Output
Output schema unknown
### clarify.v1.bundles.bundle_id.get
Get a bundle that has previously been created.
```js
clarify.v1.bundles.bundle_id.get({
  "bundle_id": ""
}, context)
```

#### Input
* input `object`
  * bundle_id required `string`: id of a bundle
  * embed `string`: list of link relations to embed in the result bundle. Zero or more of: tracks, metadata, insights. List is space or comma separated single string or an array of strings

#### Output
* output Bundle
### clarify.v1.bundles.bundle_id.put
Update a bundle. To update the tracks, media, or metadata of a bundle, use the tracks and metadata endpoints.
name can be any string you wish to associate with the bundle.
external_id is an optional parameter that can be used to logically link a bundle to an item in an external system. The external_id can be whatever you use to identify items in your own database.
notify_url is a webhook. It must be a publicly accessible url (http or https) on your server to which notifications for the bundle will be POSTed. There are three types of notifications: Track Notifications, Insight Notifications and Bundle Notifications. For more information on the content of notifications and when they are sent, see the notification docs page.
If version is specified, the bundle will only be updated if the current version matches this parameter value. If the version doesn't match, a 409 Conflict error will be returned. If version is not specified, the bundle will always be updated.
```js
clarify.v1.bundles.bundle_id.put({
  "bundle_id": ""
}, context)
```

#### Input
* input `object`
  * bundle_id required `string`: id of a bundle
  * name `string`: Name of the bundle. Up to 128 characters.
  * notify_url `string`: URL for notifications on this bundle. Up to 2083 characters.
  * external_id `string`: A string that can refer to an item in an external system. Up to 128 characters.
  * version `integer`: Object version.

#### Output
* output Ref (of Bundle)
### clarify.v1.bundles.bundle_id.insights.get
Gets the insights for a bundle.
URLs of the available insights for the bundle are in the _links object, with the link relations (keys) of the format insight:insight_name.
Documentation on the insights available and the data returned can be found at http://docs.clarify.io/insights/
```js
clarify.v1.bundles.bundle_id.insights.get({
  "bundle_id": ""
}, context)
```

#### Input
* input `object`
  * bundle_id required `string`: id of a bundle

#### Output
* output Insights
### clarify.v1.bundles.bundle_id.insights.post
Request an insight to be run on a bundle. Note that most insights are set to automatically run on all bundles so you commonly won't need to call this endpoint except to request transcripts. To configure which insights are automatically run for an app, visit the Clarify Developer Portal.
Insights that are not configured to autorun can be requested to run on an individual bundle using this endpoint. The following insights can be requested:
* transcript_r9 - High-accuracy transcript of the speech in audio media.

  Transcripts will be produced on the mixed audio of all tracks in the bundle and are charged per minute (rounded up for partial minutes), based on the duration of the longest track. If the request has already been made, this method has no effect other than to return the existing insight.

  Transcripts will typically take about 48 hours. When the transcript is ready, an InsightNotification webhook will be POSTed to the bundle notify_url.

  For more information see Human Transcripts Quick Start.
* captions_r9 - High-accuracy captions of the speech in video media.

  Captions will be generated on the first track in the bundle and are charged per minute (rounded up for partial minutes), based on the duration of the media. See the pricing page. If the request has already been made, this method has no effect other than to return the existing insight.

  Captions will typically take about 72 hours. When the captions are ready, an InsightNotification webhook will be POSTed to the bundle notify_url.

  For more information see Captions Quick Start.
* spoken_keywords - Spoken words of interest found in audio media. Note: Normally spoken_keywords is set to autorun so you do not need to run it explicitly.
* spoken_topics - Topics spoken about in the audio media.
```js
clarify.v1.bundles.bundle_id.insights.post({
  "bundle_id": "",
  "insight": ""
}, context)
```

#### Input
* input `object`
  * bundle_id required `string`: id of a bundle
  * insight required `string` (values: transcript_r9, captions_r9, spoken_keywords, spoken_topics, spoken_words): name of the insight: transcript_r9, captions_r9, spoken_keywords, spoken_topics, spoken_words

#### Output
* output Insight
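
The per-minute billing rule for transcripts (longest track, partial minutes rounded up) works out as in the sketch below. `billableTranscriptMinutes` is a hypothetical helper, and it assumes track durations are given in seconds; no prices are included because this README does not state them.

```js
// Hypothetical helper: billable minutes for a transcript_r9 request.
// Assumption: durations are expressed in seconds.
function billableTranscriptMinutes(trackDurationsSeconds) {
  if (trackDurationsSeconds.length === 0) return 0;
  // Billing is based on the longest track in the bundle...
  const longest = Math.max(...trackDurationsSeconds);
  // ...and partial minutes are rounded up.
  return Math.ceil(longest / 60);
}
```

For example, a bundle with tracks of 61 and 30 seconds would be billed as 2 minutes under these assumptions.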
### clarify.v1bundlesbundle_idinsightsinsight_id
Gets a particular insight for a bundle. Typically, you will hit this endpoint from a link contained in a response to /v1/bundles/{bundle_id}/insights
The insight response may contain a data object containing insight-specific data and/or an array of objects called track_data, where the array indexes correspond to the tracks in the bundle. Each object in the array contains the track_id, track_label and insight-specific data related to that insight. For example, in the spoken_words insight, the track_data objects contain the field word_count which is the number of spoken words found in the track.
Documentation on the insights available and the data returned can be found at http://docs.clarify.io/insights/
Insights that contain data in different file formats (such as for video captions) will have one or more link relations in the _links array for the corresponding data. Note that the href URLs in these links have a limited lifespan and should not be stored locally.
```js
clarify.v1bundlesbundle_idinsightsinsight_id({
  "bundle_id": "",
  "insight_id": ""
}, context)
```

#### Input
* input `object`
  * bundle_id required `string`: id of a bundle
  * insight_id required `string`: id of an insight

#### Output
* output Insight
### clarify.v1.bundles.bundle_id.metadata.delete
Delete the metadata of a bundle and set data to {} (empty object.) This is functionally equivalent to an update metadata request with data set to {}.
A successful response will be an HTTP code 204 with an empty body.
```js
clarify.v1.bundles.bundle_id.metadata.delete({
  "bundle_id": ""
}, context)
```

#### Input
* input `object`
  * bundle_id required `string`: id of a bundle

#### Output
Output schema unknown
### clarify.v1.bundles.bundle_id.metadata.get
Gets the metadata for a bundle.
```js
clarify.v1.bundles.bundle_id.metadata.get({
  "bundle_id": ""
}, context)
```

#### Input
* input `object`
  * bundle_id required `string`: id of a bundle

#### Output
* output Metadata
### clarify.v1.bundles.bundle_id.metadata.put
Update the metadata for a bundle.
The metadata is a single-level JSON object of your own definition, containing key-values that can be searched and filtered on. Metadata can be used to hold text such as names, titles, descriptions and values for segregating bundles, for example by user, topic, folder name etc. The keys (property names) can be up to 64 characters and must contain only alphanumeric characters and underscore (but not start with underscore) and must not be a reserved name. Reserved names are "true", "false", and "null". Values can be strings, numbers, boolean true/false, date-times represented as a string in ISO 8601 format (ex. "2014-02-25T14:23:45.000Z"), or an array of these primitive types. Strings can be up to 2000 characters and strings in arrays can be up to 128 characters each. Nested objects are not allowed. Metadata can contain up to 50 key-value pairs up to a total JSON size of 4000 characters.
To clear the metadata for a bundle, send data={}.
If version is specified, the metadata will only be updated if the current version matches this parameter value. If the version doesn't match, a 409 Conflict will be returned. If version is not specified, the metadata will always be updated.
```js
clarify.v1.bundles.bundle_id.metadata.put({
  "bundle_id": "",
  "data": ""
}, context)
```

#### Input
* input `object`
  * bundle_id required `string`: id of a bundle
  * data required `string`: User-defined JSON data associated with the bundle. Must be valid JSON, up to 4000 characters.
  * version `integer`: Object version.

#### Output
* output Ref
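
The metadata rules above can be checked on the client before sending a request. The following is a minimal sketch under stated assumptions: `validateMetadata` is a hypothetical helper, reserved names are matched exactly (case sensitivity is not specified here), and the server remains the authority on what is accepted.

```js
const RESERVED_NAMES = new Set(['true', 'false', 'null']);
// Up to 64 alphanumeric/underscore characters, not starting with an underscore.
const KEY_PATTERN = /^[A-Za-z0-9][A-Za-z0-9_]{0,63}$/;

function isPrimitive(value) {
  return ['string', 'number', 'boolean'].includes(typeof value);
}

// Returns a list of problems; an empty list means the object passed these checks.
function validateMetadata(metadata) {
  const errors = [];
  const keys = Object.keys(metadata);
  if (keys.length > 50) errors.push('more than 50 key-value pairs');
  for (const key of keys) {
    if (!KEY_PATTERN.test(key) || RESERVED_NAMES.has(key)) errors.push(`invalid key: ${key}`);
    const value = metadata[key];
    if (Array.isArray(value)) {
      for (const item of value) {
        if (!isPrimitive(item)) errors.push(`${key}: unsupported array item`);
        else if (typeof item === 'string' && item.length > 128) errors.push(`${key}: array string over 128 characters`);
      }
    } else if (!isPrimitive(value)) {
      errors.push(`${key}: unsupported value (nested objects are not allowed)`);
    } else if (typeof value === 'string' && value.length > 2000) {
      errors.push(`${key}: string over 2000 characters`);
    }
  }
  if (JSON.stringify(metadata).length > 4000) errors.push('JSON larger than 4000 characters');
  return errors;
}
```

Since `data` is a string input, a caller would pass `JSON.stringify(metadata)` once the returned list is empty.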
### clarify.v1.bundles.bundle_id.tracks.delete
Delete tracks of a bundle. This will only delete media stored on Clarify systems and not delete the source media on remote systems.
A successful response will be an HTTP code 204 with an empty body.
```js
clarify.v1.bundles.bundle_id.tracks.delete({
  "bundle_id": ""
}, context)
```

#### Input
* input `object`
  * bundle_id required `string`: id of a bundle

#### Output
Output schema unknown
### clarify.v1.bundles.bundle_id.tracks.get
Gets the array of tracks for a bundle. This includes the specification of the media and the status of fetching and processing it.
Media for tracks is fetched asynchronously. Until media has been retrieved, a track's duration and size will both be set to -1.
```js
clarify.v1.bundles.bundle_id.tracks.get({
  "bundle_id": ""
}, context)
```

#### Input
* input `object`
  * bundle_id required `string`: id of a bundle

#### Output
* output Tracks
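
Because media is fetched asynchronously, a caller may want to know which tracks are still waiting. The sketch below relies only on the -1 convention described above; `unfetchedTracks` is a hypothetical helper, and it assumes the caller has already pulled an array of track objects with `duration` and `size` fields out of the Tracks response.

```js
// Hypothetical helper: tracks whose media has not been retrieved yet.
// Both duration and size are documented as -1 until the media is fetched.
function unfetchedTracks(tracks) {
  return tracks.filter(track => track.duration === -1 || track.size === -1);
}
```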
### clarify.v1.bundles.bundle_id.tracks.post
Add a new track to a bundle. This will insert or append a new track in the tracks array or return an error if the maximum number of tracks (12) has been reached or the track number specifies an invalid index.
Once all media parts have been added to a track it is immutable, meaning it cannot be modified. If you wish to modify a track, simply add a new one and delete the existing one.
label is a short name for the track.
media_url must be a publicly accessible url to a media file. It will be fetched asynchronously after the REST call returns. The audio can be mono or stereo.
audio_channel is used to specify audio channels if the media is a stereo file. A value of left or right signifies that only the specified channel will be used. If no value or an empty string is specified for audio_channel, all channels will be used in a single track. If your stereo channels were recorded separately with each channel containing distinct content (for example if 2 legs of a phone call were recorded separately and combined into a single stereo file), for best speech recognition, create two tracks, with audio_channel set to left and right respectively. If your stereo file is simply a recording made with a stereo microphone, audio_channel should be set to an empty string (or not be specified.)
audio_language can be used to specify the language of the audio media. This is an optional parameter and if not specified or an empty string, the language of the track will be automatically detected. If specified, it must be a language code as described in RFC5646 (see http://tools.ietf.org/html/rfc5646). Supported languages: en-US, en-UK, es, fr.
start_time is the time in seconds at which the media starts, relative to the start time of the bundle. This allows you to specify sequential parts of media. If not specified, the default is 0.
parts_pending is a boolean flag specifying whether more media parts will subsequently be added to the track. If true, a subsequent API call must be made to signify that the track is complete. If not specified, the default is false.
track is the index in the tracks array where the new track will be added. Track numbers start at 0. If this parameter is not specified the new track will always be appended to the end of the array. If the track specified is greater than the last index of the array + 1, an error will be returned.
If version is specified, the track will only be added if the current version matches this parameter value. If the version doesn't match, a 409 Conflict error will be returned. If version is not specified, the track will always be added.
```js
clarify.v1.bundles.bundle_id.tracks.post({
  "bundle_id": "",
  "media_url": ""
}, context)
```

#### Input
* input `object`
  * bundle_id required `string`: id of a bundle
  * label `string`: Label for the track. Up to 128 characters.
  * media_url required `string`: URL of a media file for this bundle. Up to 2083 characters.
  * audio_channel `string` (values: left, right): The audio channel to use for the track ( "" | left | right ). Default is empty string which means all channels of audio in the media file are used for the track.
  * audio_language `string` (values: en-US, en-UK, es, fr): Language of the audio in the track, specified with an RFC5646 code.
  * start_time `number`: Time offset in seconds that the media starts relative to the bundle. Default is 0.
  * parts_pending `boolean`: Set to true if more media parts will be added to the track. Default is false.
  * track `integer`: Track number specifies the index of the new track in the tracks array. An integer from 0 to 11. If not specified, the new track is appended to the array.
  * version `integer`: Object version.

#### Output
* output Ref (of Track)
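
The limits on the track parameter (at most 12 tracks, index no greater than the last index + 1) can be checked before calling the API. This is a sketch only: `checkTrackIndex` is a hypothetical helper and the messages are illustrative, not the API's actual error text.

```js
const MAX_TRACKS = 12;

// Hypothetical helper: returns null when the request looks valid, otherwise a reason.
// existingCount is the current length of the bundle's tracks array.
function checkTrackIndex(existingCount, track) {
  if (existingCount >= MAX_TRACKS) return 'maximum number of tracks (12) reached';
  if (track === undefined) return null; // omitted: the track is appended
  if (!Number.isInteger(track) || track < 0) return 'track must be a non-negative integer';
  // Last index is existingCount - 1, so the largest valid value is existingCount.
  if (track > existingCount) return 'track is greater than the last index + 1';
  return null;
}
```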
### clarify.v1.bundles.bundle_id.tracks.put
Update tracks for a bundle.

parts_complete is a boolean. If true, any tracks in the PENDING state will be queued for processing and no more media parts may be added to the tracks. Default is false.
```js
clarify.v1.bundles.bundle_id.tracks.put({
  "bundle_id": "",
  "parts_complete": true
}, context)
```

#### Input
* input `object`
  * bundle_id required `string`: id of a bundle
  * parts_complete required `boolean`: Set to true if media parts in all tracks are complete. Default is false.
  * version `integer`: Object version.

#### Output
* output Ref
### clarify.v1.bundles.bundle_id.tracks.track_id.delete
```js
clarify.v1.bundles.bundle_id.tracks.track_id.delete({
  "bundle_id": "",
  "track_id": ""
}, context)
```

#### Input
* input `object`
  * bundle_id required `string`: id of a bundle
  * track_id required `string`: id of a track

#### Output
Output schema unknown
### clarify.v1.bundles.bundle_id.tracks.track_id.get
```js
clarify.v1.bundles.bundle_id.tracks.track_id.get({
  "bundle_id": "",
  "track_id": ""
}, context)
```

#### Input
* input `object`
  * bundle_id required `string`: id of a bundle
  * track_id required `string`: id of a track

#### Output
* output Track
### clarify.v1.bundles.bundle_id.tracks.track_id.put
```js
clarify.v1.bundles.bundle_id.tracks.track_id.put({
  "bundle_id": "",
  "track_id": "",
  "media_url": ""
}, context)
```

#### Input
* input `object`
  * bundle_id required `string`: id of a bundle
  * track_id required `string`: id of a track
  * media_url required `string`: URL of a media file for this bundle. Up to 2083 characters.
  * audio_channel `string` (values: left, right): The audio channel to use for the track ( "" | left | right ). Default is empty string which means all channels of audio in the media file are used for the track.
  * audio_language `string` (values: en-US, en-UK, es, fr): Language of the audio in the track, specified with an RFC5646 code.
  * start_time `number`: Time offset in seconds that the media starts relative to the bundle. Default is 0.
  * parts_pending `boolean`: Set to true if more media parts will be added to the track. Default is false.
  * version `integer`: Object version.

#### Output
* output Ref (of Track)
### clarify.v1reportsscores
The filter parameter is built from terms of the form:

field-name comparison-operator literal-value

where:
* field-name is a metadata field or bundle.name, bundle.id, bundle.external_id, bundle.created, or bundle.updated.
* comparison-operator is ==, <, >, <=, >=, or !=
* literal-value is a number (integer or decimal), boolean true or false, or a string with either double quotes (") or single quotes (').

Terms can be combined with && (logical AND) and || (logical OR). A logical NOT is ! and can be placed before a term (or group of terms.)

Example: `category=="music" && (tag == "soft" || tag == "smooth") && tag != "jazz" && bundle.created > "2014-03-15T00:00:00.0Z"`

The language parameter specifies the language to use for analyzing the report. This value is only relevant for language-related insight data. Supported languages: en, en-UK, en-US, es, fr.
```js
clarify.v1reportsscores({
  "interval": "",
  "score_field": "",
  "group_field": ""
}, context)
```

#### Input
* input `object`
  * interval required `string` (values: year, quarter, month, week, day, hour): Duration of report periods. Default is month.
  * score_field required `string`: A bundle/metadata field to use as a score. Ex. insights.spoken_words.listener_score.
  * group_field required `string`: A metadata field by which to group scores, typically a user or team id field.
  * filter `string`: filter expression, typically programmatically generated based on input controls and data segregation rules etc. Up to 500 characters.
  * language `string` (values: en, en-UK, en-US, es, fr): Language to search in, specified with an RFC5646 code. Default is "en"

#### Output
* output BundleReport
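
Since filter expressions are typically generated programmatically, a small builder can keep the quoting consistent. The helpers below (`literal`, `term`, `and`, `or`) are hypothetical and only cover the grammar described in this README; escaping of quotes inside strings is not documented here, so a value containing both quote styles is rejected rather than guessed at.

```js
const OPERATORS = ['==', '<', '>', '<=', '>=', '!='];

// Numbers and booleans are written bare; everything else becomes a quoted string.
function literal(value) {
  if (typeof value === 'number' || typeof value === 'boolean') return String(value);
  const text = String(value);
  if (!text.includes('"')) return `"${text}"`;
  if (!text.includes("'")) return `'${text}'`;
  throw new Error('value contains both quote styles; escaping rules are not documented');
}

function term(field, operator, value) {
  if (!OPERATORS.includes(operator)) throw new Error(`unsupported operator: ${operator}`);
  return `${field} ${operator} ${literal(value)}`;
}

// Combine terms; OR groups are parenthesized so they nest safely inside AND.
const and = (...terms) => terms.join(' && ');
const or = (...terms) => `(${terms.join(' || ')})`;
```

For instance, `and(term('category', '==', 'music'), or(term('tag', '==', 'soft'), term('tag', '==', 'smooth')))` yields a string suitable for the `filter` input, subject to the 500 character limit.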
### clarify.v1reportstrends
The filter parameter is built from terms of the form:

field-name comparison-operator literal-value

where:
* field-name is a metadata field or bundle.name, bundle.id, bundle.external_id, bundle.created, or bundle.updated.
* comparison-operator is ==, <, >, <=, >=, or !=
* literal-value is a number (integer or decimal), boolean true or false, or a string with either double quotes (") or single quotes (').

Terms can be combined with && (logical AND) and || (logical OR). A logical NOT is ! and can be placed before a term (or group of terms.)

Example: `category=="music" && (tag == "soft" || tag == "smooth") && tag != "jazz" && bundle.created > "2014-03-15T00:00:00.0Z"`

The language parameter specifies the language to use for analyzing the report. This value is only relevant for language-related insight data. Supported languages: en, en-UK, en-US, es, fr.
```js
clarify.v1reportstrends({
  "interval": ""
}, context)
```

#### Input
* input `object`
  * interval required `string` (values: year, quarter, month, week, day, hour): Duration of report periods. Default is month.
  * content `string`: Content reported in each period. Zero or more of tracks, spoken_words, spoken_keywords. List is space or comma separated single string or an array of strings.
  * filter `string`: filter expression, typically programmatically generated based on input controls and data segregation rules etc. Up to 500 characters.
  * language `string` (values: en, en-UK, en-US, es, fr): Language to search in, specified with an RFC5646 code. Default is "en"

#### Output
* output BundleReport
### clarify.v1search
open voice) which will find all bundles matching all the words. To search for a phrase, put it in quotes (ex. "open source"). You can exclude bundles that contain a word by putting a minus (hyphen) in front of the word (ex. -opaque). To search for one word or another, use OR (in uppercase) between the words (ex. pizza OR pasta). As an alternative to OR, you can use | (pipe character). A full query could look something like: `restaurant "little italy" pizza OR pasta -mushrooms`

| query_fields | Bundle data searched | |
| --- | --- | --- |
| * | all data | This is the default value. |
| insights.spoken_words | [spoken words] | All audio tracks are searched. |
| fieldname | metadata.fieldname | Your custom metadata field. Wildcard metadata.* searches all metadata fields. |
| bundle.fieldname | bundle.fieldname | The searchable bundle fieldnames are name, id, external_id, created and updated. Wildcard bundle.* searches all bundle fields. |

The filter parameter is built from terms of the form:

field-name comparison-operator literal-value

where:
* field-name is a metadata field or bundle.name, bundle.id, bundle.external_id, bundle.created, or bundle.updated.
* comparison-operator is ==, <, >, <=, >=, or !=
* literal-value is a number (integer or decimal), boolean true or false, or a string with either double quotes (") or single quotes (').

Terms can be combined with && (logical AND) and || (logical OR). A logical NOT is ! and can be placed before a term (or group of terms.)

Example: `category=="music" && (tag == "soft" || tag == "smooth") && tag != "jazz" && bundle.created > "2014-03-15T00:00:00.0Z"`

The language parameter specifies the language of the words in the search query. This value is used for word-stemming etc. while searching text. Regardless of what you set for this parameter, all your bundles will be searched, no matter what language content they contain. Supported languages: en, en-UK, en-US, es, fr.

After getting the initial list, use the first, next, prev link relations to get more bundles in the list. Note that next will not be available at the end of the list and prev will not be available at the start of the list. A maximum of limit items will be returned. If the results are exactly one page, neither prev nor next will be available.

The embed parameter specifies link relations to embed in the results. For link relations that are curies (ex. "clarify:metadata"), you may simply use the base name (ex. "metadata").
```js
clarify.v1search({}, context)
```

#### Input
* input `object`
  * query `string`: search terms, typically as typed into a search field. Up to 120 characters.
  * query_fields `string`: list of insights, metadata, and bundle fields to search with the query. Use insights.spoken_words for searching audio, metadata.* for all metadata fields, bundle.* for all bundle fields, * for audio and all fields. Default is insights.spoken_words and metadata.*. List is space or comma separated single string or an array of strings. If single string, up to 1024 characters.
  * filter `string`: filter expression, typically programmatically generated based on input controls and data segregation rules etc. Up to 500 characters.
  * language `string` (values: en, en-UK, en-US, es, fr): Language to search in, specified with an RFC5646 code. Default is "en"
  * limit `integer`: limit results to specified number of bundles. Default is 10. Max 100.
  * embed `string`: list of link relations to embed in the result collection. Zero or more of: items, tracks, metadata, insights. List is space or comma separated single string or an array of strings
  * iterator `string`: opaque value, automatically provided in next/prev links

#### Output
* output SearchCollection
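
The query syntax described for this action (all words, quoted phrases, OR alternatives, minus exclusions) can be assembled with a helper along these lines. `buildQuery` and its option names are hypothetical, and how OR groups bind relative to neighbouring words is not specified in this README, so the sketch simply reproduces the shape of the documented example.

```js
// Wrap multi-word phrases in double quotes; single words are left as they are.
function quoteIfPhrase(text) {
  return /\s/.test(text) ? `"${text}"` : text;
}

// all: words/phrases that must match; any: alternatives joined with OR;
// exclude: single words to exclude with a leading minus.
function buildQuery({ all = [], any = [], exclude = [] } = {}) {
  const parts = [
    ...all.map(quoteIfPhrase),
    any.map(quoteIfPhrase).join(' OR '),
    ...exclude.map(word => `-${word}`),
  ].filter(Boolean);
  const query = parts.join(' ');
  if (query.length > 120) throw new Error('query exceeds 120 characters');
  return query;
}
```

The result would be passed as the `query` input of `clarify.v1search`.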