K is a proprietary array processing programming language developed by Arthur Whitney and commercialized by Kx Systems. The language serves as the foundation for kdb+, an in-memory, column-based database, and other related financial products. The language, originally developed in 1993, is a variant of APL and contains elements of Scheme. Advocates of the language emphasize its speed, facility in handling arrays, and expressive syntax.
History
Before developing K, Arthur Whitney had worked extensively with APL, first at I. P. Sharp Associates alongside Ken Iverson and Roger Hui, and later at Morgan Stanley developing financial applications. At Morgan Stanley, Whitney helped to develop A+, a variant of APL, to facilitate migrating APL applications from IBM mainframe computers to a network of Sun workstations. A+ had a smaller set of primitive functions and was designed for speed and to handle large sets of time series data. In 1993, Whitney left Morgan Stanley and developed the first version of the K language. At the same time he formed Kx Systems to commercialize the product and signed an exclusive contract with Union Bank of Switzerland. For the next four years he developed various financial and trading applications using K for UBS. The contract ended in 1997 when UBS merged with Swiss Bank. In 1998, Kx Systems released kdb+, a database built on K. kdb was an in-memory, column-oriented database and included ksql, a query language with an SQL-like syntax. Since then, several financial products have been developed with K and kdb+. kdb+/tick and kdb+/taq were developed in 2001. kdb+, a 64-bit version of kdb+ was released in 2003 and kdb+/tick and kdb+/taq were released in 2004. kdb+ included Q, a language that merged the functions of the underlying K language and ksql. Whitney released a derivative of K called Shakti in 2018.
Overview
K shares key features with APL. They are both interpreted, interactive languages noted for concise and expressive syntax. They have simple rules of precedence based on right to left evaluation. The languages contain a rich set of primitive functions designed for processing arrays. These primitive functions include mathematical operations that work on arrays as whole data objects, and array operations, such as sorting or reversing the order of an array. In addition, the language contains special operators that combine with primitive functions to perform types of iteration and recursion. As a result, complex and extended transformations of a dataset can be expressed as a chain of sub-expressions, with each link performing a segment of the calculation and passing the results to the next link in the chain. Like APL, the primitive functions and operators are represented by single or double characters; however, unlike APL, K restricts itself to the ASCII character set. To allow for this, the set of primitive functions for K is smaller and heavily overloaded, with each of the ASCII symbols representing two or more distinct functions or operations. In a given expression, the actual function referenced is determined by the context. As a result, K expressions can be opaque and difficult to parse for humans. For example, in the following contrived expression the exclamation point! refers to three distinct functions:
2!!7!4
Reading from right to left the first ! is modulo division that is performed on 7 and 4 resulting in 3. The next ! is enumeration and lists the integers less than 3, resulting in the list 0 1 2. The final ! is rotation where the list on the right is rotated two times to the left producing the final result of 2 0 1. The second core distinction of K is that functions are first-class objects, a concept borrowed from Scheme. First-class functions can be used in the same contexts where a data value can be used. Functions can be specified as anonymous expressions and used directly with other expressions. Function expressions are specified in K using curly brackets. For example, in the following expression a quadratic expression is defined as a function and applied to the values 0 1 2 and 3: '!4 In K, named functions are simply function expressions stored to a variable in the same way any data value is stored to a variable. a:25 f: Functions can be passed as an argument to another function or returned as a result from a function.
Examples
K is an interpreted language where every statement is evaluated and its results displayed immediately. Literal expressions such as strings evaluate to themselves. Consequently, the Hello world-program is trivial:
"Hello world!"
The following expression sorts a list of strings by their lengths: x@>#:'x
The expression is evaluated from right to left as follows:
2_ drops the first two elements of the enumeration.
x!/: performs modulo division between the original integer and each value in the truncated list.
&/ find the minimum value of the list of modulo result.
If x is not prime then one of the values returned by the modulo operation will be 0 and consequently the minimal value of the list. If x is prime then the minimal value will be 1, because x mod 2 is 1 for any prime greater than 2. The function below can be used to list all of the prime numbers between 1 and R with: 2_&'!R
The expression is evaluated from right to left
!R enumerate the integers less than R.
' apply each value of the enumeration to the prime number function on the left. This will return a list of 0's and 1's.
& return the indices of the list where the value is 1.
2_ drop the first two elements of the enumeration
K financial products
K is the foundation for a family of financial products. Kdb+ is an in-memory, column-based database with much of the same functions of a relational database management system. The database supports SQL, SQL-92 and ksql, a query language with a syntax similar to SQL and designed for column based queries and array analysis. Kdb+ is available for several operating systems, including Solaris, Linux, macOS, and Windows.