2023:CMISQL Query
CMIS Queries are utilized to search documents in CMIS Repositories and to filter documents upon import when using the Import Query Results provider.
| Previous Versions |
|---|
Also called a "CMISQL query", the querying language CMIS Queries use is based on a subset of the SQL-92 grammar. Where SQL is a querying language to search and select data in a database, "CMISQL" is a querying language to search and select documents (and their metadata properties) in a storage location, represented by a CMIS Repository in Grooper.
About
CMIS Queries use clauses to retrieve content from a CMIS Repositories based on certain metadata property values, much like a SQL query uses clauses to retrieve data from a database based on column values. In both cases, a SELECT clause is used to select content based on certain conditions determined by a WHERE clause
In general the CMISQL statement takes the following form:
SELECT * FROM <Content Type> WHERE <Conditions> ORDER BY <Sort Criteria>
For example, if you wanted to retrieve all email messages in your Inbox folder over an Exchange CMIS Connection from a particular sender and order results by the date you received the email (oldest first), you could use the following query:
SELECT * FROM Message WHERE (Sender LIKE '%user@example.com' AND IN_FOLDER('/Inbox')) ORDER BY DateTimeReceived ASC
Next, let's break down this query further and look at what each clause is doing.
Clause Information
SELECT
SELECT * FROM Message
WHERE (Sender LIKE '%user@example.com' AND IN_FOLDER('/Inbox'))
ORDER BY DateTimeReceived ASC
The SELECT clause defines a set of property values to be returned with each query result. "Property values" refers to the available metadata properties for the object type selected for the From statement. For example, when querying email messages using in an Exchange CMIS Repository, there are metadata properties for the sender, the subject line, the date the email was sent, and more. All these metadata properties are part and parcel to the email message file itself.
From a practical standpoint, your CMIS Query is ultimately returning documents. Files in a digital storage location. You aren't retrieving individual metadata values, rather the whole file with all its various metadata included. To that end you will use the * character to select all property values, when crafting a CMISQL query.
You will always start your CMIS Query with SELECT * when using a CMIS Query to import documents.
| FYI |
When would you use something besides The answer is CMIS Lookups. A CMIS Lookup operates much like a database lookup. Instead of querying a database table's columns for data, a CMIS Query queries the metadata properties of documents in a storage location.
|
FROM
SELECT * FROM Message
WHERE (Sender LIKE '%user@example.com' AND IN_FOLDER('/Inbox'))
ORDER BY DateTimeReceived ASC
The FROM clause determines what type of content you're searching for, such as a digital file or a folder and their metadata properties.
- The type of content you select determines what metadata properties you have access to when building the conditional logic in your query's
WHEREclause.
Depending on the CMIS Connection type, you will have different options to select from. Some have simple "Document" and "Folder" content types, representing documents and folders respectively. Some have a multitude of content types, each with metadata properties specific to the CMIS Binding.
- For example, the Exchange binding's Message content type has property values relating to email messages, such as Subject, Sender and DateTimeSent.
- For example, The SharePoint binding's content types can be Document Libraries, giving you access to custom metadata properties you've created on the SharePoint site by adding custom columns to the Document Library.
In our example, we wanted to query email messages in an Exchange repository. So, we selected the Message type by adding FROM Message to our query.
WHERE
SELECT * FROM Message
WHERE (Sender LIKE '%user@example.com' AND IN_FOLDER('/Inbox'))
ORDER BY DateTimeReceived ASC
The WHERE clause defines what search conditions must be met for a document to be retrieved. Each condition is defined by a "predicate". You can use any metadata available to the content type (as determined by the FROM clause) in your query conditions.
- Multiple conditions/predicates can be joined with the
ANDorORoperators. - You can manipulate the logical order of operations by using nested parenthesis.
- You can logically negate predicates using the
NOToperator.
The following is a list of predicates:
| Predicate | Description | Example |
| Comparison Predicate |
Specifies a condition for an individual property using comparisons, such as "equals to"
|
DateTimeCreated < '12/31/2007'
|
| In Predicate | Specifies a list of allowed values for a property. This list is separated by commas. | FileExtension IN ('.pdf', '.docx', '.xlsx')
|
| Contains Predicate | Specifies a full-text query.
|
CONTAINS('mortgage AND payment AND NOT vehicle')
|
| Scope Predicate | Restricts the search scope to a specific folder (IN_FOLDER) or a specific folder including its subfolders(IN_TREE) |
IN_FOLDER('/Documents/Grooper')
|
- Note: Not every property type may be able to utilize every predicate/operators within a predicate. For example, the Subject property on the Exchange binding cannot use the
=operator when crafting a Comparison Predicate. - Note: The
NOToperator cannot be used with theIN_FOLDERorIN_TREEScope Predicate.
In our example, we used a combination of the Comparison Predicate and Scope Predicate to only retrieve email messages from certain sender (Sender LIKE '%user@example.com') and in our inbox folder ((IN_FOLDER('/Inbox/')) Since we wanted to filter content by multiple criteria, the two conditions were joined with the AND operator.
ORDER BY
SELECT * FROM Message
WHERE (Sender LIKE '%user@example.com' AND IN_FOLDER('/Inbox'))
ORDER BY DateTimeReceived ASC
The ORDER BY clause is an optional clause which allows you to specify the order in which results are returned. You can sort by multiple properties using a comma separated list. Each property name may be followed by ASC or DESC to indicate ascending or descending sort direction.
- The default sort direction is ascending. However, it is still considered best practice to include the
ASCtag to properties you wish to sort in ascending order.
In our example, we wanted to sort our incoming messages by oldest received first.
CMIS Query Editor
If you are not familiar with the CMISQL syntax (or even SQL), you may find it easier to use the CMIS Query Editor, instead of writing the full query yourself. This allows you to configure the CMISQL query using a series of property grids and menu selections instead of writing out the query from scratch.
The CMIS Query Editor is accessible in two locations:
- When configuring the Import Query Results Import Provider's CMIS Query property.
- Using a CMIS Repository's "Search Repository" tab.
| ⚠ |
Only certain external storage platforms are currently queryable with a CMIS Query. The following CMIS Binding sources cannot be queried currently.
As such, these bindings are not suitable for the Import Query Results provider (You should instead use Import Descendants). Furthermore, you will not have access to the "Search Repository" tab for these bindings when selecting a CMIS Repository.
|
How to Access the CMIS Query Editor
Option 1: Import Query Results > CMIS Query
Most commonly, you will use a CMISQL query to import documents using the Import Query Results provider.
| FYI |
The Import Query Results provider is one of two Import Providers that use a CMIS Connection to bring document content into Grooper. If you need more information on Import Query Results, please visit the CMIS Import article. |
|
|
|
|
|
Option 2: The Search Repository Tab
You can also use the CMIS Query Editor when searching for documents in a CMIS Repository using the "Search Repository" tab.
|
|
We will demonstrate how to use this query editor in the next section
How to Use the CMIS Query Editor
Next, we're going to illustrate how to use the CMIS Query Editor in order to filter email messages based on certain criteria. The CMIS Query Editor's interface allows you to configure the CMISQL query based on available metadata to the CMIS Binding.
We will use the Exchange binding, which has a selection of queryable metadata for email messages, such as the email's subject and date the message was received. Our query will import email messages based on the following conditions:
- Only messages are to be imported (as opposed to other Exchange content, such as appointments or tasks).
- Only messages in a certain folder will be imported.
- Only messages containing specific text search terms will be imported.
- Only messages from a certain sender will be imported.
- Only messages that have not been read yet will be imported.
This will get us through the basics of building a CMISQL query using the CMIS Query Editor.
The SELECT and FROM ClausesThe property grid is divided into the various clauses that form the CMISQL query. Adjusting these properties will adjust the CMISQL query.
|
|
Changing Content Types in the FROM ClauseWe will use the CMIS Query Editor's property panel to edit our query going forward.
As we use the property grid to further configure our query, the query's text will update automatically.
|
The WHERE ClauseNext, we will use the Where property to configure the various filter conditions we've set for ourselves.
|
|
Using Predicate Collection for Multiple Search Conditions
|
Adding Search PredicatesNow that we're in the Where Predicate Expressions collection editor, we can start adding and configuring our search conditions.
|
|||
The Scope PredicateThe Scope Predicate allows you to choose a folder from which you want to retrieve documents.
|
|||
The Contains PredicateIf the storage location your CMIS Repository is connected to supports full text searching, you can use the Contains Predicate to query documents based on specific text search terms. For example, we want to retrieve email messages pertaining to "Wiki Vitals", a specific kind of report generated nightly about the Grooper wiki and sent to my email. To add a Contains Predicate:
|
|||
The NOT OperatorAll predicates can be logically negated using the The only way I know it's a monthly report is the term "monthly report" is listed in the body of the email. I can use the Contains Predicate and the To logically negate a predicate:
|
|||
The Comparison PredicateThe Comparison Predicate specifies search criteria based on a document's metadata property values, using comparison operators such as For example, we will use a Comparison Predicate to only import emails that have not been read yet, whose "IsRead" property value equals "False". To add a Comparison Predicate':
|
|||
The LIKE OperatorThe For example, if we want to use a Comparison Predicate to filter messages based on the "Sender" (a string value), we will need to use the
|
|||
| |||
Finish Adding Predicates
|
|||
|
|
|||
We could be done at this point. We have the three basic parts of a CMIS Query.
- The query starts with
SELECT * - The
FROMclause determines which content (documents/folders and their metadata properties) we're querying from the CMIS Repository/storage location. - The
WHEREclause defines the query's search conditions.
Optionally, you may want to order the documents Grooper retrieves. We will do that next, with the ORDER BY clause.
The ORDER BY Clause
The ORDER BY clause is an optional CMISQL querying clause used to change the sort order of the documents Grooper retrieves. You can sort by any metadata property available.
For example, we may want to sort by the date the email message was sent. And specifically we want the most recent email first in the list and the last email last in the list. We're going to need an ORDER BY clause to do this.
|
|
|
| |
Testing the Query
|
Unsupported Query Configurations
The NTFS Binding will not support full text search (the CONTAINS() predicate) unless the Windows file system has been indexed by the Windows Indexing Service.
The SharePoint Binding does not support the IN_FOLDER predicate when selecting a folder location to query. You must use the IN_TREE predicate instead.
- Be aware the
IN_TREEsearch is recursive whereIN_FOLDERis not. This means any subfolders will be queried as well as the targeted folder.
The SharePoint Binding has a hard limitation when it comes to querying larger repositories, with many thousands of folders and files. There is currently a threshold limit of 5000 rows. This means if a query has to scan more than 5000 items to return an item, SharePoint will return the following error:
- "Internal Server Error (500): The attempted operation is prohibited because it exceeds the list view threshold."

















