ROM-LDAP

lightweight directory object mapping for Ruby


Introduction

I am the developer of the ROM-LDAP Ruby gem, an LDAP adapter for ROM-RB. The code base currently has over 500 commits[1] and a test suite, built against four different LDAP vendors[2], with 90% coverage and over 1000 specs per run.

This project is the result of code I started writing whilst working at LBU managing the digital studio[3] in the Leeds School of Arts. What began as part of a Rails dashboard for monitoring my students' user accounts, at a time when I had to run my own LDAP server, became an attempt to make something I could reuse and share.

This library borrows heavily from the work of net-ldap, sequel and rom-sql and was intended as a vehicle for my own learning. I thought a ROM adapter for LDAP would be a good candidate because LDAP is not only a common service in many corporate environments, but also appears in many Ruby projects where authentication is required. I wanted a cleaner interface for interacting with a directory server and something that did not monkey-patch the core Ruby classes.

I had previously considered directories, like OpenLDAP, as just a source of user data; however, now I am using rom-ldap in ways I hadn’t anticipated. I am building applications with LDAP as a backend document store and writing custom schema to model and validate the data at the persistence layer.

If you are a researcher, and building taxonomies is part of your work, or if your Ruby project authenticates against an LDAP directory, or if you are an LDAP administrator who likes to script, then this project might be of interest. The examples below will touch on some rom-rb conventions and how those have been applied to this protocol.


Getting Started

By default a gateway will connect to localhost on the default port, i.e. localhost:389.

ROM::Configuration.new(:ldap)

Alternatively, you can connect by providing an explicit URI[4], which is a common method among clients.

ROM::Configuration.new(:ldap, 'ldap://cn=admin,dc=rom,dc=ldap:topsecret@openldap:1389/dc=rom,dc=ldap')

RFC 4516 defines an LDAP URL as ldap://host:port/DN?attributes?scope?filter?extensions; however, rom-ldap uses a URI scheme like that of HTTP, with the bind DN and password placed before the host.
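
To make the HTTP-like layout concrete, here is a minimal sketch that splits such a URI into its parts with a plain regular expression. The `LDAP_URI` pattern and the `split_ldap_uri` helper are illustrative assumptions and not the parser rom-ldap uses; socket paths are not handled.

```ruby
# Hypothetical helper for illustration only; not part of rom-ldap.
LDAP_URI = %r{
  \Aldap://
  (?:(?<bind_dn>[^:@]+):(?<password>[^@]*)@)?
  (?<host>[^:/]+)
  (?::(?<port>\d+))?
  (?:/(?<base>.*))?
  \z
}x

def split_ldap_uri(uri)
  match = LDAP_URI.match(uri)
  raise ArgumentError, "unrecognised LDAP URI: #{uri}" unless match

  {
    bind_dn:  match[:bind_dn],
    password: match[:password],
    host:     match[:host],
    port:     (match[:port] || 389).to_i, # fall back to the default LDAP port
    base:     match[:base]
  }
end

parts = split_ldap_uri('ldap://cn=admin,dc=rom,dc=ldap:topsecret@openldap:1389/dc=rom,dc=ldap')
parts[:host] # => "openldap"
parts[:port] # => 1389
```

The credentials and search base come back as plain strings, while a URI without credentials yields `nil` for both `:bind_dn` and `:password`.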

You can also use a socket to establish the connection; however, the search base must then be passed as an additional option.

ROM::Configuration.new(:ldap, 'ldap://cn=admin,dc=rom,dc=ldap:topsecret@/var/run/ldapi')

If an environment variable is more appropriate, you can connect using the LDAPURI variable.

ENV['LDAPURI'] = 'ldap://ldap-server:1389'

ROM::Configuration.new(:ldap)

Or use a combination of the following variables: LDAPHOST, LDAPPORT, LDAPBASE, LDAPBINDDN and LDAPBINDPW.

You can also override configuration parameters by providing them explicitly in the options hash.

ENV['LDAPURI'] = 'ldap://ldap-server/dc=rom,dc=ldap'

ROM::Configuration.new(:ldap,
  base: 'ou=dev,dc=rom,dc=ldap',
  username: 'developer',
  password: 'passwd'
)
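
The precedence described here, where explicit options win over the environment, can be sketched in plain Ruby. The `resolve_options` helper and the mapping of option names to variables (for example `username:` to LDAPBINDDN) are assumptions made for illustration, not rom-ldap's own code.

```ruby
# Hypothetical mapping of option names to environment variables (an assumption).
ENV_KEYS = {
  host:     'LDAPHOST',
  port:     'LDAPPORT',
  base:     'LDAPBASE',
  username: 'LDAPBINDDN',
  password: 'LDAPBINDPW'
}.freeze

# Collect whatever the environment provides, then let explicit options win.
def resolve_options(env, explicit = {})
  from_env = ENV_KEYS.each_with_object({}) do |(key, var), acc|
    acc[key] = env[var] if env.key?(var)
  end
  from_env.merge(explicit)
end

env = { 'LDAPHOST' => 'ldap-server', 'LDAPBASE' => 'dc=rom,dc=ldap' }
options = resolve_options(env, { base: 'ou=dev,dc=rom,dc=ldap', username: 'developer' })
options[:base] # => "ou=dev,dc=rom,dc=ldap"
```

A plain hash stands in for `ENV` here so the sketch has no side effects; `ENV` responds to the same `key?` and `[]` calls.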

Extensions

ROM-LDAP, like rom-sql and sequel, leverages extensions for added functionality. This example would connect to a directory defined using environment variables with an extension enabled in the options.

ROM::Configuration.new(:ldap, extensions: %i[compatibility])

The available extensions are: :compatibility, :dsml_export, :msgpack_export, :optimised_json

  1. Compatibility converts the attribute names of LDAP entries, commonly formatted using camelCase or kebab-case, into snake_case, making them valid Ruby method names.
  2. DSML[5] Export adds #to_dsml to relations and requires libxml.
  3. MessagePack Export adds #to_msgpack to relations and requires msgpack.
  4. Optimised JSON uses the oj gem as a drop-in replacement for the #to_json method on relations.

An example of the :compatibility extension attribute name mapping used on an ApacheDS server.

relation.dataset.directory.key_map.take(5).to_h
# => {:a_record=>"aRecord",
#     :access_control_subentries=>"accessControlSubentries",
#     :administrative_role=>"administrativeRole",
#     :ads_allow_anonymous_access=>"ads-allowAnonymousAccess",
#     :ads_authenticator_class=>"ads-authenticatorClass"}
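
The kind of name conversion shown above can be approximated in a few lines. This is a minimal sketch, assuming a simple regular-expression approach; `ruby_name` is a hypothetical helper and not the extension's actual implementation.

```ruby
# Hypothetical helper for illustration; not the :compatibility extension itself.
def ruby_name(ldap_name)
  ldap_name
    .gsub(/([a-z\d])([A-Z])/, '\1_\2') # insert "_" at each camelCase hump
    .tr('-', '_')                       # kebab-case separators become underscores
    .downcase
    .to_sym
end

ruby_name('aRecord')                  # => :a_record
ruby_name('ads-allowAnonymousAccess') # => :ads_allow_anonymous_access
ruby_name('labeledURI')               # => :labeled_uri
```

Keeping the original name alongside the converted one, as the key map does, is what lets queries be translated back to the directory's own attribute names.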

Tasks

ROM-LDAP includes some rake tasks for working with directories:

  • rom/ldap/tasks/ldap.rake, a wrapper around ldapsearch and ldapmodify (to batch process a folder of LDIF files).
  • rom/ldap/tasks/ldif.rake, which has no external dependencies and leverages rom-ldap to import an LDIF file or print LDIF to the console.

Import an LDIF file using an authenticated socket and print server responses.

$ DEBUG=y LDAPURI=ldap://cn=admin,dc=rom,dc=ldap:topsecret@$PWD/tmp/openldap/ldapi rake ldif:import'[examples/ldif/users.ldif]'

The LDAPDIR variable can be used to assign the folder containing your LDIF modifications.

$ LDAPDIR=$PWD/examples/ldif rake ldap:modify

By default, with no argument, export queries the search base using (objectClass=*). Here a filter is passed explicitly.

$ LDAPURI=ldap://cn=admin,dc=rom,dc=ldap:topsecret@localhost:2389/dc=rom,dc=ldap rake 'ldif:export[(cn=*)]' > entries.ldif

Attribute Types

Like other ROM adapters, rom-ldap will map directory entries, and can coerce attribute values from an array of strings into Ruby classes with only a marginal overhead compared to ruby-net-ldap.

Benchmarks are included in the repository.

config = ROM::Configuration.new(:ldap) do |c|
  c.relations(:users) do
    schema('(objectClass=person)', as: :users) do
      attribute :object_class, ROM::LDAP::Types::Strings, read: Types::Symbols
      attribute :uid_number,   ROM::LDAP::Types::Strings.meta(index: true), read: Types::Integer
    end

    auto_struct true
  end
end

The entries themselves are returned as an array of hashes whose values have been coerced. The adapter depends on ldap-ber, a port of the net-ldap encoding rewritten as a library of refinements, so nothing is monkey-patched and attribute values are plain Ruby core classes.
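
The idea behind a read type, turning the array of strings a directory returns into a Ruby value, can be sketched without the gem. The `READERS` table and `coerce` function below are hypothetical stand-ins, not the `ROM::LDAP::Types` implementation.

```ruby
# Hypothetical read types: each lambda receives the raw array of strings.
READERS = {
  object_class: ->(values) { values.map(&:to_sym) },
  uid_number:   ->(values) { Integer(values.first) },
  extinct:      ->(values) { values.first == 'TRUE' } # LDAP booleans are TRUE/FALSE
}.freeze

# Attributes without a reader are passed through unchanged.
PASS_THROUGH = ->(values) { values }

def coerce(entry)
  entry.to_h do |name, values|
    [name, READERS.fetch(name, PASS_THROUGH).call(values)]
  end
end

raw     = { object_class: %w[top person], uid_number: ['1000'], extinct: ['FALSE'], cn: ['Jane'] }
coerced = coerce(raw)
coerced[:uid_number] # => 1000
coerced[:extinct]    # => false
```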

Some of these code examples are from a project using rom-ldap for zoological taxonomy. This OpenLDAP schema definition is important as it ensures query methods like #order and #matches will work for this attribute.

    attributetype ( 1.3.6.1.4.1.18055.0.4.1.2.1001 NAME 'species'
        DESC 'The scientific name of the animal'
        EQUALITY caseIgnoreMatch
        ORDERING caseIgnoreOrderingMatch
        SUBSTR caseIgnoreSubstringsMatch
        SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
        SINGLE-VALUE )

animals.with(auto_struct: false).matches(cn: '熊').to_a
# => [{:dn=>["cn=Giant Panda,ou=animals,dc=rom,dc=ldap"],
#      :cn=>["Giant Panda", "Cat Bear", "猫熊", "Bear Cat", "熊猫"],
#      :endangered=>true,
#      :extinct=>false,
#      :family=>"Ursidae",
#      :genus=>"Ailuropoda",
#      :labeled_uri=>["https://en.wikipedia.org/wiki/Giant_panda"],
#      :object_class=>["top", "mammalia", "extensibleObject"],
#      :order=>"Carnivora",
#      :population_count=>50,
#      :species=>"Ailuropoda melanoleuca",
#      :study=>:mammalogy}]

Basic Encoding Rules

The BER used by rom-ldap are a rewrite of the code from the net-ldap gem, converted from core extensions into refinements, and packaged as a dependency in ldap-ber. By applying the separation of concerns principle and removing the monkey-patching I aimed to keep the code modular and isolated.

Filtering

In order to chain complex queries together, rom-ldap converts the standard filter syntax used for LDAP queries into an AST[6].

animals.by_pk('cn=Lion,ou=animals,dc=rom,dc=ldap')
# => #<ROM::Relation[Animals] name=ROM::Relation::Name(animals on (species=*)) dataset=#<ROM::LDAP::Dataset: base="dc=rom,dc=ldap" [:con_and, [[:op_eql, :species, :wildcard], [:op_eql, :entry_dn, "cn=Lion,ou=animals,dc=rom,dc=ldap"]]]>>

If we inspect a relation, we can see that the dataset has combined the original search filter and the chained query into a nested array, which is equivalent to (&(species=*)(entrydn=cn=Lion,ou=animals,dc=rom,dc=ldap)). As in rom-sql, the primary key can be defined in the relation’s schema block.

    [
      :con_and, [
        [:op_eql, :species, :wildcard],
        [:op_eql, :entry_dn, "cn=Lion,ou=animals,dc=rom,dc=ldap"]
      ]
    ]
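
To show how the nested array above corresponds to a filter string, here is a minimal renderer. It is a sketch under stated assumptions: `to_filter` is a hypothetical function, it only knows the node names that appear in this article, it passes attribute names through unchanged, and it skips the value escaping that RFC 4515 requires.

```ruby
# Hypothetical renderer; only the node names seen in this article are handled.
OPERATORS = { op_eql: '=', op_lte: '<=' }.freeze

def to_filter(ast)
  head, *rest = ast
  case head
  when :con_and
    # :con_and wraps a list of expressions
    "(&#{rest.first.map { |node| to_filter(node) }.join})"
  when :con_not
    # :con_not wraps a single expression
    "(!#{to_filter(rest.first)})"
  else
    attribute, value = rest
    value = '*' if value == :wildcard
    "(#{attribute}#{OPERATORS.fetch(head)}#{value})"
  end
end

ast = [
  :con_and, [
    [:op_eql, :species, :wildcard],
    [:op_eql, :entrydn, 'cn=Lion,ou=animals,dc=rom,dc=ldap']
  ]
]
to_filter(ast)
# => "(&(species=*)(entrydn=cn=Lion,ou=animals,dc=rom,dc=ldap))"
```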

DSL

The ROM::LDAP::Dataset defines a DSL[7] that corresponds to the filter types used in LDAP queries. Here are just two examples copied from the class with their accompanying documentation. The method #present receives an attribute name and returns a presence filter encoded as an AST chained to a new dataset instance. The second method, #gt, uses the less-than-or-equal-to operator :op_lte and inverts it with the negation constructor :con_not.

# Presence filter aliased as 'has'.
#
# @example
#   relation.present(:uid)
#   relation.has(:mail)
#
# @param attribute [Symbol]
#
# @return [Dataset]
def present(attribute)
  chain(:op_eql, attribute, :wildcard)
end
alias_method :has, :present

# Greater than filter
#
# @param args [Hash]
#
# @return [Dataset]
def gt(args)
  chain(:con_not, [:op_lte, *args.to_a[0]])
end
alias_method :above, :gt
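
The LDAP filter grammar only offers >= and <= comparisons, which is why a strict greater-than has to be written as a negated <=. The sketch below models just the chaining idea: `MiniDataset` is a hypothetical stand-in, and the way its `chain` combines criteria with :con_and is an assumption based on the inspected relation shown earlier, not the gem's source.

```ruby
# Hypothetical stand-in for a dataset; only the chaining idea is modelled.
class MiniDataset
  attr_reader :criteria

  def initialize(criteria = nil)
    @criteria = criteria
  end

  def present(attribute)
    chain(:op_eql, attribute, :wildcard)
  end

  def gt(args)
    chain(:con_not, [:op_lte, *args.to_a[0]])
  end

  private

  # AND the new expression onto any existing criteria and return a new object,
  # leaving the receiver untouched.
  def chain(*expression)
    combined = criteria ? [:con_and, [criteria, expression]] : expression
    self.class.new(combined)
  end
end

base  = MiniDataset.new
query = base.present(:cn).gt(population_count: 100)
query.criteria
# => [:con_and, [[:op_eql, :cn, :wildcard], [:con_not, [:op_lte, :population_count, 100]]]]
base.criteria
# => nil
```

Because every call returns a fresh object, a partially built query can be stored and extended in different directions without one branch affecting another.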

Query Interface

The built-in query methods have a similar design to those of other query builders like Sequel and ActiveRecord. This means that if you already use an RDBMS[8] then ROM-LDAP will be very familiar.

The standard #where method.

animals.where(extinct: true).to_a
# => [#<Entities::Animal cn=["Dodo"] study=nil family="Columbidae" genus="Raphus" order="Columbiformes" species="Raphus cucullatus" description=nil discovery_date=1598-01-01 00:00:00 UTC dn=["cn=Dodo,ou=extinct,ou=animals,dc=rom,dc=ldap"] endangered=nil extinct=true labeled_uri=nil object_class=["top", "aves", "extensibleObject"] population_count=0>]

Passing a block to #where exposes attributes that you can call methods on, or you can use backticks to query a raw LDAP filter string.

animals.where { species.is 'homo sapiens' }.one.study
# => :anthropology

animals.where { population_count < 100 }.list(:species)
# => ["Hydrochoerus hydrochaeris",
#     "Pongo borneo",
#     "Testudo graeca",
#     "Dendrobates tinctorius",
#     "Phascolarctos cinereus",
#     "Ailuropoda melanoleuca",
#     "Equus zebra",
#     "Myobatrachus gouldii",
#     "Acerodon celebensis",
#     "Orcinus orca",
#     "Ornithorhynchus anatinus",
#     "Phoenicoparrus jamesi",
#     "Helarctos malayanus",
#     "Vulpes vulpes",
#     "Raphus cucullatus",
#     "Turdus migratorius"]

animals.where { `(cn=dodo)` }.count
# => 1

Within the block it is also possible to use some applicable Ruby operators; this example searches for users whose first name does not sound like mine.

users.where { given_name !~ 'peter' }

Use #matches to perform fuzzy wildcard queries on the values passed. This means the attributes should have SUBSTR in their schema definition.

animals.matches(cn: 'hum').one
# => #<Entities::Animal cn=["Human"] study=:anthropology family="Hominidae" genus="Homo" order="Primates" species="Homo sapiens" description=["Modern humans are the only extant members of the subtribe Hominina, a branch of the tribe Hominini belonging to the family of great apes."] discovery_date=nil dn=["cn=Human,ou=animals,dc=rom,dc=ldap"] endangered=false extinct=false labeled_uri=["https://en.wikipedia.org/wiki/Human"] object_class=["top", "mammalia", "extensibleObject"] population_count=7582530942>

The #find method, aliased as #grep, allows you to query all the attributes defined in the schema with meta(grep: true).

relation.find('eo')

#pluck returns an array of attribute values from the entries in the relation.

users.pluck(:uidnumber)
# => ["1", "2"]

users.pluck(:cn)
# => [["Cat", "House Cat"], ["Mouse"]]

users.pluck(:gidnumber, :uid)
# => [["1", "Jane"], ["2", "Joe"]]

You can also rename the attributes on demand using #select or #project.

users.select { cn.as(:user_name) }.one
# => {:user_name => "Peter Hamilton"}

Relations

Ultimately, it means you can design relation classes with relevant and reusable methods…

def vegetarians
  unequal(order: 'carnivora')
end

def population_above(num)
  gte(population_count: num)
end

def population_below(num)
  where { population_count < num }
end

def detailed
  present(:description)
end

…and using the auto_restrictions plugin, with attributes identified in the schema, we can have these finder methods generated automatically.

schema do
  attribute :family,  ROM::LDAP::Types::String.meta(index: true)
  attribute :genus,   ROM::LDAP::Types::String.meta(index: true)
  attribute :order,   ROM::LDAP::Types::String.meta(index: true)
  attribute :species, ROM::LDAP::Types::String.meta(index: true)
end

use :auto_restrictions

def carnivores
  by_order('carnivora')
end

def great_apes
  by_family('hominidae')
end

def bears
  by_family('ursidae')
end
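
How finder methods can be generated from a list of indexed attributes is easy to sketch with `define_method`. `MiniRelation`, its `indexed` macro and its in-memory `where` are hypothetical stand-ins for illustration; they are not how the auto_restrictions plugin is implemented.

```ruby
# Hypothetical miniature: one by_<name> finder per indexed attribute.
class MiniRelation
  def self.indexed(*names)
    names.each do |name|
      define_method(:"by_#{name}") { |value| where(name => value) }
    end
  end

  indexed :family, :order

  attr_reader :tuples

  def initialize(tuples)
    @tuples = tuples
  end

  # In-memory, case-insensitive equality, loosely mimicking caseIgnoreMatch.
  def where(conditions)
    matching = tuples.select do |tuple|
      conditions.all? { |key, value| tuple[key].to_s.casecmp?(value.to_s) }
    end
    self.class.new(matching)
  end

  def bears
    by_family('ursidae')
  end
end

animals = MiniRelation.new([
  { cn: 'Giant Panda', family: 'Ursidae',   order: 'Carnivora' },
  { cn: 'Human',       family: 'Hominidae', order: 'Primates' }
])
animals.bears.tuples.map { |tuple| tuple[:cn] } # => ["Giant Panda"]
```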

When you add, edit or remove entries the affected tuples[9] are returned if successfully persisted.

relation.insert(
  dn: 'uid=batman,ou=comic,dc=rom,dc=ldap',
  cn: 'The Dark Knight',
  uid: 'batman',
  given_name: 'Bruce',
  sn: 'Wayne',
  apple_imhandle: 'bruce-wayne',
  object_class: %w[extensibleObject inetOrgPerson]
)

relation.has(:given_name).update(given_name: nil, sn: 'REDACTED')


relation.delete

ROM::LDAP::Relation has class methods base and branches to make defining and inheriting from parent classes easier.

base 'dc=rom,dc=ldap'

branches animals: 'ou=animals,dc=rom,dc=ldap',
         extinct: 'ou=extinct,ou=animals,dc=rom,dc=ldap'

Structs

We can define custom struct classes inheriting from ROM::Struct, and because LDAP entries behave like NoSQL[10] key-value and wide-column documents, we can easily handle cases where attributes are not present by including the omittable transformation type.

module Entities
  class Animal < ROM::Struct
    transform_types(&:omittable)

    def common_name
      cn.first.upcase
    end
  end
end


animals.equal(cn: 'orangutan').one.cn
# => ["Orangutan"]

animals.equal(cn: 'orangutan').one.common_name
# => "ORANGUTAN"

Mappers

ROM::Transformer classes and #map_with work the same with our directory entries.

class TransformAnimal < ROM::Transformer
  relation    :animals
  register_as :classification

  map_array do
    rename_keys modify_timestamp: :updated_at,
                create_timestamp: :created_at

    nest :taxonomy, %i[species order family genus]
    nest :status,   %i[extinct endangered population_count]
    nest :info,     %i[labeled_uri description cn]
  end
end

animals.where(cn: 'megabat').map_with(:classification).to_a

# => [{:study=>:chiropterology,
#      :dn=>["cn=Sulawesi Fruit Bat,ou=animals,dc=rom,dc=ldap"],
#      :object_class=>["top", "mammalia", "extensibleObject"],
#      :taxonomy=>
#       {:species=>"Acerodon celebensis",
#        :order=>"Chiroptera",
#        :family=>"Pteropodidae",
#        :genus=>"Acerodon"},
#      :status=>{},
#      :info=>
#       {:labeled_uri=>["https://en.wikipedia.org/wiki/Sulawesi_flying_fox"],
#        :description=>
#         ["They are frugivores and rely on their keen senses of sight and smell to locate food."],
#        :cn=>["Megabat", "Sulawesi Fruit Bat", "Sulawesi Flying Fox"]}}]
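
What `rename_keys` and `nest` do to each tuple can be sketched with plain hashes. The two functions below are hypothetical re-implementations written for illustration and are not the transproc-based code that ROM::Transformer relies on.

```ruby
# Hypothetical plain-Ruby versions of the two transformations used above.
def rename_keys(tuple, mapping)
  tuple.transform_keys { |key| mapping.fetch(key, key) }
end

# Move the listed keys under a new key; absent keys are simply skipped.
def nest(tuple, name, keys)
  nested = tuple.slice(*keys)
  tuple.reject { |key, _| keys.include?(key) }.merge(name => nested)
end

entry = {
  cn: ['Megabat'],
  species: 'Acerodon celebensis',
  order: 'Chiroptera',
  modify_timestamp: '20190101000000Z'
}

renamed  = rename_keys(entry, { modify_timestamp: :updated_at })
with_tax = nest(renamed, :taxonomy, %i[species order family genus])
result   = nest(with_tax, :status, %i[extinct endangered population_count])
result[:status] # => {}
```

Nesting keys that the entry does not have produces an empty hash, which matches the `:status=>{}` in the output above.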

Changesets

Because a DN[11] is required for every entry, you could decide to use a ROM::Changeset here. This example uses the CN attribute to build the DN. However, your LDAP server will add the RDN[12] attributes to the entry automatically if the DN is provided, and this also works with multi-valued RDNs.

class NewAnimal < ROM::Changeset::Create[:animals]
  map do |tuple|
    { dn: "cn=#{tuple[:cn]},ou=animals,dc=rom,dc=ldap", **tuple }
  end
end

people.insert(dn: 'cn=Captain America+sn=Rogers,ou=avengers,dc=marvel', objectclass: 'person')
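
Building a DN from a tuple, including the multi-valued form joined with "+", can be sketched as a small function. `build_dn` is a hypothetical helper, and it deliberately omits the escaping of special characters that RFC 4514 requires.

```ruby
# Hypothetical helper: join one or more RDN attributes with "+" and append the base.
# No RFC 4514 escaping is performed, so values must not contain special characters.
def build_dn(tuple, rdn_keys, base)
  rdn = rdn_keys.map { |key| "#{key}=#{tuple.fetch(key)}" }.join('+')
  "#{rdn},#{base}"
end

build_dn({ cn: 'Orangutan' }, %i[cn], 'ou=animals,dc=rom,dc=ldap')
# => "cn=Orangutan,ou=animals,dc=rom,dc=ldap"

build_dn({ cn: 'Captain America', sn: 'Rogers' }, %i[cn sn], 'ou=avengers,dc=marvel')
# => "cn=Captain America+sn=Rogers,ou=avengers,dc=marvel"
```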

Exporting

I covered the optional extensions earlier, but even out of the box, ROM-LDAP includes exporting to LDIF, JSON and YAML.

reptiles.to_yaml
# => "---\n- dn:\n  - cn=Panther Chameleon,ou=animals,dc=rom,dc=ldap\n  cn:\n  - Panther Chameleon\n  species:\n  - Furcifer pardalis\n- dn:\n  - cn=Leopard Gecko,ou=animals,dc=rom,dc=ldap\n  cn:\n  - Leopard Gecko\n  species:\n  - Eublepharis macularius\n- dn:\n  - cn=Spur-thighed Tortoise,ou=animals,dc=rom,dc=ldap\n  cn:\n  - Spur-thighed Tortoise\n  species:\n  - Testudo graeca\n"

apes.to_ldif
# => "dn: cn=Orangutan,ou=animals,dc=rom,dc=ldap\ncn: Orangutan\ndescription: The King of the Swingers\nextinct: FALSE\nfamily: Hominidae\ngenus: Pongo\nobjectClass: top\nobjectClass: mammalia\nobjectClass: extensibleObject\norder: Primates\npopulationCount: 0\nspecies: Pongo borneo\nstudy: primatology\n\ndn: cn=Human,ou=animals,dc=rom,dc=ldap\ncn: Human\ndescription: Modern humans are the only extant members of the subtribe Hominina, a branch of the tribe Hominini belonging to the family of great apes.\nendangered: FALSE\nextinct: FALSE\nfamily: Hominidae\ngenus: Homo\nlabeledURI: https://en.wikipedia.org/wiki/Human\nobjectClass: top\nobjectClass: mammalia\nobjectClass: extensibleObject\norder: Primates\npopulationCount: 7582530942\nspecies: Homo sapiens\nstudy: anthropology\n\n"

birds.to_json
# => "[{\"dn\":[\"cn=Dodo,ou=extinct,ou=animals,dc=rom,dc=ldap\"],\"cn\":[\"Dodo\"],\"species\":[\"Raphus cucullatus\"]},{\"dn\":[\"cn=James's Flamingo,ou=animals,dc=rom,dc=ldap\"],\"cn\":[\"James's Flamingo\"],\"species\":[\"Phoenicoparrus jamesi\"]},{\"dn\":[\"cn=American Robin,ou=animals,dc=rom,dc=ldap\"],\"cn\":[\"American Robin\"],\"species\":[\"Turdus migratorius\"]}]"
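
The LDIF layout shown above, one "name: value" line per value with a blank line between entries, is simple enough to sketch by hand. The `to_ldif` function below is a hypothetical simplification and not the gem's exporter: it keeps attribute names as given and skips the base64 encoding that RFC 2849 requires for values such as non-ASCII strings.

```ruby
# Hypothetical simplified exporter; no base64 encoding or line folding.
def to_ldif(entries)
  entries.map do |entry|
    lines = ["dn: #{Array(entry[:dn]).first}"]
    entry.each do |name, values|
      next if name == :dn

      # Multi-valued attributes become one line per value.
      Array(values).each { |value| lines << "#{name}: #{value}" }
    end
    lines.join("\n") + "\n\n"
  end.join
end

birds = [
  { dn: ['cn=Dodo,ou=extinct,ou=animals,dc=rom,dc=ldap'], cn: ['Dodo'], species: ['Raphus cucullatus'] }
]
to_ldif(birds)
# => "dn: cn=Dodo,ou=extinct,ou=animals,dc=rom,dc=ldap\ncn: Dodo\nspecies: Raphus cucullatus\n\n"
```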

Roadmap

I am currently working on implementing atomic transactions against supported vendors and securing connections with SSL. Contributions and suggestions are very welcome and details are included in the repository.

Written by Peter Hamilton and photo by Markus Spiske


  1. I am currently the sole contributor to this project. ↩︎

  2. OpenLDAP, ApacheDS, OpenDJ and 389DS. ↩︎

  3. Affectionately dubbed the Mac Lab. ↩︎

  4. Uniform Resource Identifier describes how to find and access the information. ↩︎

  5. Directory Services Markup Language is a representation of directory service information in an XML syntax. ↩︎

  6. Abstract Syntax Trees are a tree-like representation of structure and content. ↩︎

  7. Domain-Specific Language which in this case works like a markup language. ↩︎

  8. Relational Database Management Systems use a tabular schema and can suffer from object–relational impedance mismatch. ↩︎

  9. A finite ordered list of elements. ↩︎

  10. Non-relational ("not only SQL") databases for semi-structured data. ↩︎

  11. Distinguished Name uniquely identifies an entry and describes its position in the DIT[13]. ↩︎

  12. Relative Distinguished Names which make up the DN. ↩︎

  13. Directory Information Tree refers to the hierarchical structure of the entries. ↩︎