LINQ to Grooper Objects

From Grooper Wiki
Revision as of 10:44, 8 April 2021 by Dgreenwood
Language Integrated Query (LINQ, pronounced "link") is a Microsoft .NET Framework component that adds native data querying capabilities to .NET languages, originally released as a major part of .NET Framework 3.5 in 2007.

Using Language-Integrated Query (LINQ) within Grooper Expressions

Expressions play an important role in Grooper, so users often ask how to add expression writing to their skillset. The unfortunate truth is that writing expressions tends to be even more complicated than writing complete code in the form of scripts or applications. Writing an expression is no different from writing a script, except that all the conditions, logic, and output need to squeeze into one “line” of code. When applicable, LINQ achieves the same result while being more concise, more readable, and more user-friendly overall.

😎 Special thanks to BIS team member Brian Godwin for contributing this article!


What is a Query and What Does it Do?

A query is a set of instructions that describes what data to retrieve from a given data source (or sources) and what shape and organization the returned data should have. A query is distinct from the results that it produces. When using LINQ with Grooper, the data sources are in the form of object collections.

What is a Query Expression?

A query expression is a query expressed in query syntax. A query expression consists of a set of clauses written in a declarative syntax like SQL. Each clause in turn contains one or more expressions, and these expressions may themselves be either a query expression or contain a query expression.

A query expression must begin with a from clause and must end with a select or group clause. Between the first from clause and the last select or group clause, it can contain one or more of these optional clauses: where, orderby, join, let and even additional from clauses. You can also use the into keyword to enable the result of a join or group clause to serve as the source for additional query clauses in the same query expression.

A LINQ Query May do One of Three Things:

  1. Retrieve a subset of the elements to produce a new sequence without modifying the individual elements. The query may then sort or group the returned sequence in various ways, as shown in the following example (assume scores is an array of integers):
    from score in scores where score > 80 order by score descending select score
  2. Retrieve a sequence of elements as in the previous example but transform them to a new type of object. For example, a query may retrieve only the last names from certain customer records in a data source. Or it may retrieve the complete record and then use it to construct another in-memory object type or even XML data before generating the result sequence. The following example shows a projection from an int to a string. Note that the expression returns a sequence of strings, one for each matching score:
    from score in scores where score > 80 order by score descending select $"The score is {score}"
  3. Retrieve a singleton value about the source data, such as:
    1. The number of elements that match a certain condition.
    2. The element that has the greatest or least value.
    3. The first element that matches a condition, or the sum of values in a specified set of elements.
    For example, the following query returns the number of scores greater than 80 from the scores integer array:
    (from score in scores where score > 80 select score).Count()
    In the previous example, note the parentheses around the query expression: they are required so that the Count method is called on the result of the whole query.
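The three kinds of query above can be tried outside Grooper with a small console sketch. It is written in VB.NET, the language the Grooper examples later in this article use, and it assumes a plain integer array; the module and function names (ScoreQueries, HighScores, HighScoreLabels, HighScoreCount) and the sample scores are invented for illustration.

```vbnet
Option Infer On
Imports System
Imports System.Linq

' Illustrative module; none of these names come from Grooper.
Public Module ScoreQueries
    ' 1. Subset: same element type, filtered and sorted.
    Public Function HighScores(scores As Integer()) As Integer()
        Return (From score In scores
                Where score > 80
                Order By score Descending
                Select score).ToArray()
    End Function

    ' 2. Projection: each Integer is transformed into a String.
    Public Function HighScoreLabels(scores As Integer()) As String()
        Return (From score In scores
                Where score > 80
                Order By score Descending
                Select $"The score is {score}").ToArray()
    End Function

    ' 3. Singleton: one value computed from the whole sequence.
    Public Function HighScoreCount(scores As Integer()) As Integer
        Return (From score In scores Where score > 80 Select score).Count()
    End Function

    Public Sub Main()
        Dim scores As Integer() = {60, 97, 81, 92}
        For Each label In HighScoreLabels(scores)
            Console.WriteLine(label)
        Next
        Console.WriteLine(HighScoreCount(scores)) ' prints 3
    End Sub
End Module
```

Each function wraps one query so the result type is visible in the signature: an array of integers, an array of strings, and a single integer.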

How To

Starting a Query Expression

The from Clause

A query expression must begin with a from clause. It specifies a data source together with a range variable. The range variable represents each successive element in the source sequence as the source sequence is being traversed. The range variable is strongly typed based on the type of elements in the data source. In the following example, because countries is an array of Country objects, the range variable is also typed as Country. Because the range variable is strongly typed, you can use the dot operator to access any available members of the type.

from country in countries where country.Area > 500000 select country

A query expression may contain multiple from clauses. Use additional from clauses when each element in the source sequence is itself a collection or contains a collection. For example, assume that you have a collection of Country objects, each of which contains a collection of City objects named Cities. To query the City objects in each Country, use two from clauses as shown here:

from country in countries from city in country.Cities where city.Population > 10000 select city
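A minimal VB.NET sketch of the two-from query, assuming simple stand-in Country and City classes. The class shapes, the CountryQueries module, and the sample data are hypothetical and exist only to make the query self-contained.

```vbnet
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stand-in types for the Country and City objects in the text.
Public Class City
    Public Property Name As String
    Public Property Population As Integer
End Class

Public Class Country
    Public Property Name As String
    Public Property Cities As New List(Of City)
End Class

Public Module CountryQueries
    ' The second From walks the Cities collection of each country.
    Public Function LargeCityNames(countries As IEnumerable(Of Country)) As List(Of String)
        Return (From country In countries
                From city In country.Cities
                Where city.Population > 10000
                Select city.Name).ToList()
    End Function

    ' Invented sample data.
    Public Function SampleCountries() As List(Of Country)
        Dim alpha As New Country With {.Name = "Alpha"}
        alpha.Cities.Add(New City With {.Name = "Aston", .Population = 250000})
        alpha.Cities.Add(New City With {.Name = "Abbey", .Population = 5000})
        Dim beta As New Country With {.Name = "Beta"}
        beta.Cities.Add(New City With {.Name = "Borough", .Population = 40000})
        Return New List(Of Country) From {alpha, beta}
    End Function

    Public Sub Main()
        For Each cityName In LargeCityNames(SampleCountries())
            Console.WriteLine(cityName) ' Aston, then Borough
        Next
    End Sub
End Module
```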

Ending a Query Expression

A query expression must end with either a group clause or a select clause.

The group Clause

Use the group clause to produce a sequence of groups organized by a key that you specify. The key can be any data type. For example, the following query creates a sequence of groups that contains one or more Country objects and whose key is a char value.

from country in countries group country by Initial = country.Name(0) into group
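In VB.NET, a grouping query names its key and ends with an Into clause. The sketch below groups plain strings rather than Country objects to stay short; GroupingSketch, GroupByInitial, Initial, and Members are illustrative names, not Grooper members.

```vbnet
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

Public Module GroupingSketch
    ' Groups names by their first character (a Char key).
    Public Function GroupByInitial(names As IEnumerable(Of String)) As Dictionary(Of Char, List(Of String))
        ' The key needs an explicit name because name(0) is an indexed expression.
        Dim groups = From name In names
                     Group name By Initial = name(0) Into Members = Group
        Return groups.ToDictionary(Function(g) g.Initial, Function(g) g.Members.ToList())
    End Function

    Public Sub Main()
        Dim byInitial = GroupByInitial(New String() {"Chad", "Chile", "Peru"})
        Console.WriteLine(byInitial("C"c).Count) ' prints 2
    End Sub
End Module
```

Returning a dictionary is a convenience for the sketch; the query itself produces a sequence of groups, each with its key and members.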

The select Clause

Use the select clause to produce all other types of sequences. A simple select clause just produces a sequence of the same type of objects as the objects that are contained in the data source. In this example, the data source contains Country objects. The orderby clause (written as two words, order by, in the VB.NET syntax of Grooper expressions) just sorts the elements into a new order and the select clause produces a sequence of the reordered Country objects.

from country in countries order by country.Area select country

The select clause can be used to transform source data into sequences of new types. This transformation is also named a projection. In the following example, the select clause projects a sequence of anonymous types which contains only a subset of the fields in the original element. Note that the new objects are initialized by using an object initializer.

from country in countries select new with { .Name = country.Name, .Pop = country.Population }
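As a rough illustration of projection, the VB.NET sketch below builds anonymous objects with New With and then reads their Name and Pop members. The Country class, the ProjectionSketch module, and the sample values are assumptions made for the example.

```vbnet
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stand-in type.
Public Class Country
    Public Property Name As String
    Public Property Area As Integer
    Public Property Population As Integer
End Class

Public Module ProjectionSketch
    Public Function Summaries(countries As IEnumerable(Of Country)) As List(Of String)
        ' Project each Country to an anonymous type holding two of its fields.
        Dim rows = From country In countries
                   Select New With {.Name = country.Name, .Pop = country.Population}
        ' The anonymous type exposes only Name and Pop.
        Return (From row In rows Select $"{row.Name}: {row.Pop}").ToList()
    End Function

    Public Sub Main()
        Dim countries As New List(Of Country) From {
            New Country With {.Name = "Chad", .Area = 1284000, .Population = 16},
            New Country With {.Name = "Peru", .Area = 1285000, .Population = 33}
        }
        For Each line In Summaries(countries)
            Console.WriteLine(line)
        Next
    End Sub
End Module
```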

Filtering, Ordering, and Joining

Between the starting from clause, and the ending select or group clause, all other clauses (where, join, orderby, from, let) are optional. Any of the optional clauses may be used zero times or multiple times in a query body.

The where Clause

Use the where clause to filter out elements from the source data based on one or more predicate expressions. The where clause in the following example has one predicate with two conditions.

from city in cities where city.Population < 200000 and city.Population > 100000 select city

The orderby Clause

Use the orderby clause to sort the results in either ascending or descending order. You can also specify secondary sort orders. The following example performs a primary sort on the country objects by using the Area property. It then performs a secondary sort by using the Population property.

  • Note: The ascending keyword is optional; it is the default sort order if no order is specified.
from country in countries order by country.Area, country.Population descending select country
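The primary and secondary sort can be checked with a small VB.NET sketch. It assumes a stand-in Country class with integer Area and Population properties; the class, the OrderingSketch module, and the sample data are invented for illustration.

```vbnet
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stand-in type.
Public Class Country
    Public Property Name As String
    Public Property Area As Integer
    Public Property Population As Integer
End Class

Public Module OrderingSketch
    ' Primary sort: Area ascending (the default).
    ' Secondary sort: Population descending, applied only where Area ties.
    Public Function SortedNames(countries As IEnumerable(Of Country)) As List(Of String)
        Return (From country In countries
                Order By country.Area, country.Population Descending
                Select country.Name).ToList()
    End Function

    Public Sub Main()
        Dim countries As New List(Of Country) From {
            New Country With {.Name = "A", .Area = 100, .Population = 5},
            New Country With {.Name = "B", .Area = 100, .Population = 9},
            New Country With {.Name = "C", .Area = 50, .Population = 1}
        }
        For Each name In SortedNames(countries)
            Console.WriteLine(name) ' C, then B, then A
        Next
    End Sub
End Module
```

C comes first because it has the smallest Area; A and B tie on Area, so the larger Population (B) is listed before A.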

The join Clause

Use the join clause to associate and/or combine elements from one data source with elements from another data source based on an equality comparison between specified keys in each element. In LINQ, join operations are performed on sequences of objects whose elements are different types. After you have joined two sequences, you must use a select or group statement to specify which element to store in the output sequence. You can also use an anonymous type to combine properties from each set of associated elements into a new type for the output sequence. The following example associates prod objects whose Category property matches one of the categories in the categories string array. Products whose Category does not match any string in categories are filtered out. The select statement projects a new type whose properties are taken from both cat and prod.

  • Note: You can also perform a group join by storing the results of the join operation into a temporary variable by using the into keyword.
from cat in categories join prod in products on cat equals prod.Category select new with { .Category = cat, .Name = prod.Name }
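Below is a hedged VB.NET sketch of both the inner join and the group join mentioned in the note. The Product class, the JoinSketch module, and the sample data are hypothetical. In VB.NET the group join is written Group Join ... Into.

```vbnet
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stand-in type.
Public Class Product
    Public Property Name As String
    Public Property Category As String
End Class

Public Module JoinSketch
    ' Inner join: products whose Category matches no listed category are dropped.
    Public Function CategorizedProducts(categories As IEnumerable(Of String),
                                        products As IEnumerable(Of Product)) As List(Of String)
        Dim pairs = From cat In categories
                    Join prod In products On cat Equals prod.Category
                    Select New With {.Category = cat, .Name = prod.Name}
        Return (From pair In pairs Select $"{pair.Category}: {pair.Name}").ToList()
    End Function

    ' Group join: every category is kept, together with its (possibly empty) matches.
    Public Function ProductCounts(categories As IEnumerable(Of String),
                                  products As IEnumerable(Of Product)) As Dictionary(Of String, Integer)
        Dim grouped = From cat In categories
                      Group Join prod In products On cat Equals prod.Category Into Matches = Group
                      Select New With {.Category = cat, .Total = Matches.Count()}
        Return grouped.ToDictionary(Function(g) g.Category, Function(g) g.Total)
    End Function

    ' Invented sample data.
    Public Function SampleProducts() As List(Of Product)
        Return New List(Of Product) From {
            New Product With {.Name = "Apple", .Category = "Fruit"},
            New Product With {.Name = "Milk", .Category = "Dairy"},
            New Product With {.Name = "Hammer", .Category = "Tools"},
            New Product With {.Name = "Pear", .Category = "Fruit"}
        }
    End Function

    Public Sub Main()
        Dim categories = New String() {"Fruit", "Dairy", "Toys"}
        For Each line In CategorizedProducts(categories, SampleProducts())
            Console.WriteLine(line)
        Next
        Console.WriteLine(ProductCounts(categories, SampleProducts())("Toys")) ' prints 0
    End Sub
End Module
```

With this data the inner join omits both "Hammer" (no matching category) and "Toys" (no matching product), while the group join still reports "Toys" with a count of zero.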

Subqueries in a Query Expression

A query clause may itself contain a query expression, which is sometimes referred to as a subquery. Each subquery starts with its own from clause that does not necessarily point to the same data source in the first from clause. For example, the following query shows a query expression that is used in the select statement to retrieve the results of a grouping operation.

from student in students group student by student.GradeLevel into studentGroup = group select new with { .Level = GradeLevel, .HighestScore = (from student2 in studentGroup select student2.Scores.Average()).Max() }
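The subquery can be exercised with the following VB.NET sketch, which assumes a stand-in Student class with an integer GradeLevel and an array of integer Scores. The class, the SubquerySketch module, and the data are invented. After the Group By, the key is available under the inferred name GradeLevel.

```vbnet
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stand-in type.
Public Class Student
    Public Property Name As String
    Public Property GradeLevel As Integer
    Public Property Scores As Integer()
End Class

Public Module SubquerySketch
    ' For each grade level, find the highest per-student average score.
    Public Function HighestAverageByLevel(students As IEnumerable(Of Student)) As Dictionary(Of Integer, Double)
        Dim levels = From student In students
                     Group student By student.GradeLevel Into studentGroup = Group
                     Select New With {.Level = GradeLevel,
                                      .HighestScore = (From student2 In studentGroup
                                                       Select student2.Scores.Average()).Max()}
        Return levels.ToDictionary(Function(row) row.Level, Function(row) row.HighestScore)
    End Function

    ' Invented sample data.
    Public Function SampleStudents() As List(Of Student)
        Return New List(Of Student) From {
            New Student With {.Name = "Ann", .GradeLevel = 1, .Scores = New Integer() {80, 90}},
            New Student With {.Name = "Bob", .GradeLevel = 1, .Scores = New Integer() {100}},
            New Student With {.Name = "Cy", .GradeLevel = 2, .Scores = New Integer() {70, 80}}
        }
    End Function

    Public Sub Main()
        Dim best = HighestAverageByLevel(SampleStudents())
        Console.WriteLine(best(1)) ' prints 100
        Console.WriteLine(best(2)) ' prints 75
    End Sub
End Module
```

The inner query runs once per group: it averages each student's scores, and Max then picks the best average in that grade level.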

Applying LINQ to Grooper Expressions

The aspect of Grooper that makes LINQ so useful is that Grooper’s hierarchical structure relies on object collections. At a high level, imagine a batch as a collection of folders or pages, a batch process as a collection of process steps, and a data model as a collection of data elements in the form of sections, tables, fields, etc. It is not surprising that traversing these collections within an expression can be confusing, but LINQ simplifies interacting with these Grooper collections.

LINQ Example - Data Tables

For the following example, “Invoice Detail” is a Grooper Data Table with six Data Columns (Quantity, Item Id, Item Description, Item Units, Unit Price, Total Price). These could be written as calculate or validate expressions:

  1. To populate the number (count) of providers in the “Insurance Info” section
    (from provider in Demographic_Data.Insurance_Info select provider).Count()
  2. To populate a Boolean (true/false) field denoting whether any provider is BlueCross BlueShield
    (from provider in Demographic_Data.Insurance_Info where provider.Provider_Name.StartsWith("BCBS") select provider).Any()
  3. To populate a Boolean (true/false) field denoting whether any provider is outside Oklahoma
    (from provider in Demographic_Data.Insurance_Info where provider.Provider_State <> "OK" select provider).Any()

LINQ Example - Multi-instance Data Section

For the following example, "Invoice Number" is a Data Field in a multi-instance Data Section named "Invoice Section". Because the section repeats within the same document, one "Invoice Number" result is collected for every section instance.

String.Join("|", From sec In Invoice_Section Select sec.Invoice_Number)
  • Concatenates all values from the field "Invoice Number" in the multi-instance section "Invoice Section", with pipe delimiter ("|").
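To see the concatenation outside Grooper, the VB.NET sketch below replaces the section collection with a list of stand-in objects. InvoiceSectionInstance and JoinValuesSketch are hypothetical names and the invoice numbers are invented; in Grooper, Invoice_Section is supplied by the expression environment.

```vbnet
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stand-in for one instance of the "Invoice Section".
Public Class InvoiceSectionInstance
    Public Property Invoice_Number As String
End Class

Public Module JoinValuesSketch
    ' Same shape as the expression above: a query passed straight to String.Join.
    Public Function JoinInvoiceNumbers(Invoice_Section As IEnumerable(Of InvoiceSectionInstance)) As String
        Return String.Join("|", From sec In Invoice_Section Select sec.Invoice_Number)
    End Function

    Public Sub Main()
        Dim sections As New List(Of InvoiceSectionInstance) From {
            New InvoiceSectionInstance With {.Invoice_Number = "INV-1"},
            New InvoiceSectionInstance With {.Invoice_Number = "INV-2"},
            New InvoiceSectionInstance With {.Invoice_Number = "INV-3"}
        }
        Console.WriteLine(JoinInvoiceNumbers(sections)) ' prints INV-1|INV-2|INV-3
    End Sub
End Module
```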

LINQ Example - Batch Process

For the following example, a Should Submit expression sends a batch (or potentially a folder) through an attended Review activity if it contains any “Generic” documents. This Should Submit expression would be written on the Review step. Note the DirectCast operator, which casts Item to its specific type (Batch Folder) so that the properties and methods of BatchFolder objects are available.

(from folder in DirectCast(Item, BatchFolder).Folders where folder.ContentTypeName = "Generic" select folder).Any()
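The cast-then-query pattern can be sketched in VB.NET with a stand-in class. The BatchFolder class below is a hypothetical two-property stub, not the real Grooper type, and Item is declared as Object only to mimic a variable whose declared type is more general than BatchFolder.

```vbnet
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stub; the real Grooper BatchFolder has many more members.
Public Class BatchFolder
    Public Property ContentTypeName As String
    Public Property Folders As New List(Of BatchFolder)
End Class

Public Module ShouldSubmitSketch
    ' DirectCast exposes BatchFolder members on Item, so Folders can be queried.
    Public Function ShouldSubmit(Item As Object) As Boolean
        Return (From folder In DirectCast(Item, BatchFolder).Folders
                Where folder.ContentTypeName = "Generic"
                Select folder).Any()
    End Function

    Public Sub Main()
        Dim batch As New BatchFolder()
        batch.Folders.Add(New BatchFolder With {.ContentTypeName = "Invoice"})
        Console.WriteLine(ShouldSubmit(batch)) ' prints False
        batch.Folders.Add(New BatchFolder With {.ContentTypeName = "Generic"})
        Console.WriteLine(ShouldSubmit(batch)) ' prints True
    End Sub
End Module
```

Any() stops at the first matching folder, which suits a yes/no decision such as Should Submit.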

Version Differences

LINQ expressions are not available in versions of Grooper before 2.9.