Rittman Mead Consulting

Subscribe to Rittman Mead Consulting feed Rittman Mead Consulting
Rittman Mead consults, trains, and innovates within the world of Oracle Business Intelligence, data integration, and analytics.
Updated: 2 weeks 1 day ago

Oracle’s New Data Visualization Desktop

Tue, 2016-05-03 05:00

A recent addition to the Oracle lineup of visualization tools is the Oracle Data Visualization Desktop. Described by Oracle as a “single user desktop application that provides Oracle Data Visualization functionality to business users,” DVD is an easy-to-install data visualization tool for Windows 7, 8 or 10 that packs some very powerful features.

I recently had a chance to sit down and explore DVD and wanted to share some of my first impressions.

At its core, DVD is a stand-alone version of Oracle’s DVCS. If you are at all familiar with Visual Analyzer, you will feel right at home.

Screen Shot 2016-04-29 at 9.34.25 AM

Installation was a breeze on my Windows 10 VM and only took about 5 minutes and required no additional software or plugins for the standard VA functionality.

After installation, launching DVD is as easy as clicking on the desktop icon like any other stand-alone application.

Screen Shot 2016-04-29 at 2.05.26 PM

Screen Shot 2016-04-29 at 10.02.43 AM

After the ODV startup, I was presented with a home screen which contains a search field for finding projects, a list of user folders and a main window to select individual visualizations that have been created.

Screen Shot 2016-05-02 at 11.15.59 AM

Clicking on the hamburger Screen Shot 2016-04-29 at 2.24.46 PM icon in the top left corner brings up a menu where I can choose to start Visual Analyzer with the last data source selected, select new Data Sources or create a new VA Project.

Screen Shot 2016-04-29 at 2.02.58 PM

I chose to create a new VA project and selected the sample data from Oracle (the sample data is an optional install chosen during the DVD install process). Creating a dashboard was a fairly straightforward process. Following Visual Analyzer’s functionality of dragging and dropping columns, I was able to put together a simple sales and profit dashboard in a few minutes.

Screen Shot 2016-04-29 at 2.39.44 PM

While creating my dashboard, I noticed that Oracle has included some new visualization types. You can now choose Scatter (Cat.), Stacked Scatter (Cat.), Donut or Sunburst visualizations.

Screen Shot 2016-04-29 at 1.50.12 PM

Screen Shot 2016-04-29 at 10.15.46 AM

One other feature that Oracle added to DVD is the ability to insert images onto the dashboards. You can choose to upload your own image or link to a URL to pull images from the web.

Screen Shot 2016-05-02 at 8.21.19 AM

Screen Shot 2016-05-02 at 8.25.25 AM

I uploaded an image and changed the canvas layout to freeform, which allowed me to move the image anywhere on the dashboard. By adjusting the transparency it is possible to have the image underlay the entire dashboard and still be able to see the visualizations. This example is pretty extreme, and in a real world scenario, caution should be used as to not obstruct the visualizations.

Screen Shot 2016-05-02 at 8.33.38 AM

Next I decided to try to connect to my Oracle 12c sample database to pull in some new data to work with. Selecting “Create New Datasource” from the menu prompted me with three options: create from a file, from an existing app or from a database.

Screen Shot 2016-04-29 at 11.24.23 AM

Clicking on the “From Database” option, I was presented with a connection screen.

Screen Shot 2016-05-01 at 1.15.42 PM

On this screen I discovered one of the most impressive things about DVD. Clicking on “Database Type” reveals a dropdown menu which you can choose from a variety of database formats, including Spark, Hive and Mongo DB, among others.
Screen Shot 2016-05-01 at 1.16.15 PM

That’s awesome.

Because I already had 12c DB installed, I selected the Oracle Database Type and entered all my connection information.

Once a connection to the database is made, it shows up in the available connections list. Clicking on my sample database brought up a list of available schemas to choose from. In this case, I chose the sample HR schema which then brings up a list of tables available to add as data sources for visualizations.

Screen Shot 2016-04-29 at 3.11.22 PM

Screen Shot 2016-04-29 at 3.04.32 PM

I chose to add EMPLOYEES, JOBS and LOCATIONS and then started a new VA project. The HR tables now show up in the list of available data sources.

Screen Shot 2016-04-29 at 3.17.22 PM

I selected EMPLOYEES and JOBS and, within seconds, was able to create a simple table showing a list of employee names, their job titles, salaries and commission percentages.

Screen Shot 2016-04-29 at 11.35.09 AM

As you can see, adding new data sources is quick and easy and allows users to explore their data and create meaningful visualizations from that data in a very short amount of time.

Another feature is the Advanced Analytics portion of Oracle Data Visualization Desktop. This feature, which uses R, gives users the ability to do things like highlight outliers or show trend lines with a click of a button.

Screen Shot 2016-04-29 at 3.43.43 PM

Screen Shot 2016-04-29 at 3.48.50 PM

This feature does require an optional install located within the DVD application folder. The install process proved once again to be very quick and easy and completed in about 5 minutes.

Screen Shot 2016-04-29 at 10.02.23 AM

Screen Shot 2016-04-29 at 9.58.38 AM

After the installation was complete, I created a new VA project. Choosing the sample data provided by Oracle for DVD, I created a quick scatter chart and then, by right clicking anywhere on the visualization, clicked “Add Outliers.”

Screen Shot 2016-05-02 at 9.29.19 AM

Screen Shot 2016-05-02 at 9.29.39 AM

As you can see, outliers and non-outliers are easily distinguishable by the color key that DVD assigned automatically.

Next, I wanted to see how if I could change some of the colors in my visualization. DVD allows you to do this under the visualization menu.

Screen Shot 2016-05-02 at 9.35.17 AM

Screen Shot 2016-05-02 at 9.38.43 AM

Screen Shot 2016-05-02 at 9.39.33 AM

As with OBIEE, entering specific hex values is supported as well as selecting from pre-made color pallets is possible with DVD.

Using the same right-click functionality that I used for adding outliers, I was able to additionally add a polynomial trend line to show a gains and losses.

Screen Shot 2016-05-02 at 9.43.58 AM

Next, I decided to see if I could export this data and import it into Excel. Choosing export from the visualization menu, I was able to easily export the data as a .CSV and upload it into Excel.

Screen Shot 2016-05-02 at 9.54.35 AM

Overall, Oracle Data Visualization Desktop is a very impressive new addition to the to the DVCS lineup. The ability to collect data from multiple sources, its native adaptors for a variety of popular databases, and the ability to manipulate visualizations to convey the data in creative ways make it a strong contender against Tableau and Wave. It requires no remote server infrastructure and is a solid business solution for users Oracle Data Visualization functionality in a small and easily accessible package.

I feel as though I have just cracked the surface of everything this tool can do. Check back for future blogs and articles as we at Rittman Mead continue to explore the possibilities of DVD. The future of data visualization may be closer than we think.

If you would like more information about Visual Analyzer or the Oracle Cloud Service, see this blog post by Mark Rittman.

If you would like to watch the official Tech Demo of DVD, you can find it here.

Rittman Mead also offers in depth professional training courses for OBIEE 12c and Visual Analyzer.

The post Oracle’s New Data Visualization Desktop appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Experiments with Elastic’s Graph Tool

Fri, 2016-04-29 08:17

Elastic announced their Graph tool at ElastiCON 2016 (see presentation here). It’s part of the forthcoming X-Pack which bundles Graph along with other helper tools such as Shield and Marvel. Graph itself is two things; an extension of Elasticsearch’s capabilities, enabling the user to explore how items indexed in Elasticsearch are related, and a plugin for Kibana that acts as an optional front-end for this new functionality.

You can find a good introduction to Graph and the purpose and theory behind it in the documentation here. The installation of the components themselves is simple and documented here.

First Graph

To use Graph, you just point it at your existing data in Elasticsearch. The first data set I’m going to explore is one of the standard ones that everyone uses; Twitter. I’m streaming it in through Logstash (via Kafka for flexibility), but if you wanted you could ship it in via JDBC from any RDBMS, or from HDFS too.
See an important note at the end of this article about the slice of data within it, because it affects how the relationships visualised here should be viewed. 

On launching Kibana’s Graph plugin (http://localhost:5601/app/graph) I choose the index (note that index patterns, e.g. when partitioning by date, are not supported yet), and the field in the data that I want to use as my vertices. A point to note here – “vertices” are usually called “nodes” in Graph terminology, but since Elasticsearch already uses “nodes” as part of its infrastructure topology terminology, they had to pick a different term.

In the search box, I can put my search term from which I’m interested to see the related ‘vertices’.

Sounds baffling? It is, kinda – right up until you run it (hit enter from the search box or click the magnifying glass search icon) and see what happens:

Here we’re seeing the hashtags used in tweets that mention Kibana. The “connections” (Elastic term) or “edges” (general Graph term) show which vertices (nodes) are related, and the width indicates the strength of that relationship (based on Elasticsearch’s significant terms and scoring algorithm). For more details, see the “Behind the Scenes” section towards the end of this article.

We can add in a second set of vertices by running a second search (“Elasticsearch”) – the results for these are, in effect, appended to the existing ones:

Add Links

Since we’ve pulled back an additional set of vertices, it could be that there’s overlap between these and the first set (you’d kinda of expect it, Elasticsearch and Kibana being related). To visualise this, use the Add Links button

Note how the graph redraws itself with additional connections:

Blinked and you missed it? Use the Undo button to step back, and Redo button to re-apply.

Grouping Vertices

If you look closely at the graph you’ll see that Elasticsearch, ElasticSearch, and elasticsearch are all there as separate vertices. This is because I’m using a non-analyzed index field, so the strings are treated literally, case included. In this specific example, we’d probably re-run the graph using the analysed version of the field, which following the same two searches as above gives this:

But, sticking with our non-analysed example, we can use it to demonstrate Graph’s ability to group multiple terms together into a single vertex. Switch to Advanced Mode:

and then select the three vertices and click the group option

Now all three, and their connections, are as one:

Whilst the above analysed/non-analysed difference gave me excuse to show the group function (can you tell I’ve done many-a-failed-live-demo? ;-) ), I’m now going to switch over to a graph built on the analysed version of the hashtag field, as we saw briefly above:

Tidying up the Graph – Delete and Blacklist

There’s a few straglers on the Graph that are making it less easy to comprehend. We can temporarily remove them, or even blacklist them from appearing again in this session:

Expand Selection

One of the points of Graph analysis is visualising the relationships in your data in a way that standard relational methods may not lend themselves to so easily. We can now start to explore this further, by digging into the Graph that we’ve got so far. This process, along with the add links seen above, is often called “spidering“. By selecting the elasticsearch node and clicking on Expand selection we can see additional (by default, five) vertices related to this one:

So we see that kafka is related to Elasticsearch (in the view of the twitterati, at least), and let’s expand that Kafka vertex too:

By clicking the Expand selection button again for the same vertex we get further results added:

We can select one node (e.g. realtime) an using the Add Link see additional relationships:

But, there are many nodes, and we want to see any relationships. So, switch to Advanced Mode, select All

…add Add Link again:

Knob Twiddling

Let’s start with a blank canvas, in basic mode, showing hashtags related to … me (@rmoff)!

But, surely I do more than talk about OBIEE and ODI? Like, Elasticsearch? Let’s relax the Graph selection criteria, under Settings:

and run the search again (on top of the existing results):

There’s more results … but I know how much I tweet and it feels like I’m only seeing a part of the picture. By switching over to Advanced Mode, we can refine how many results each field returns:

I reset the workspace (undo to blank, or just reload), and run the search again, this time with a greater number of hashtag field values shown, and with the same relaxed search settings as shown above:

At this point I’m into “fiddling” territory, twiddling with the ‘Number of terms’, ‘Significant’ and ‘Certainty’ knobs to see how the results vary. You can read more about the algorithm behind the Significance setting here, and more about the Graph API here. The certainty setting is simply “The min number of documents that are required as evidence before introducing a related term”, so by lowering it we see more links, but potentially with more “noise” too, of terms that aren’t really related.

An important point to note here is the dataset that I’m using is already biased because of the terms I’m including in my twitter feed search, therefore I’d expect to see this skew in the results below. See the section at the end of this article for more details of the dataset.

  • 50 terms, significant unticked, certainty 1 (as above)
  • 50 terms, significant ticked, certainty 1
  • 50 terms, significant ticked, certainty 3
  • 20 terms, significant ticked, certainty 1
  • 20 terms, significant unticked, certainty 1

Based on the above, “Significant” seems to reduce the number of relationships discovered, but increase the level of weight shown in those that are there.

Adding Additional Vertex Fields

So we’ve seen a basic overview of how to generate Graphs, expand selections, and add relationships to those additional selections. Let’s look now at how multiple fields can be added to a Graph.

Starting with a blank workspace, I switched to Advanced Mode and added two fields from my twitter data:

  • user.screen_name
  • in_reply_to_screen_name

Note that you can customise the colour and icon of different fields.

Under Options I’ve left Significant Links enabled, and set Certainty to 1.

Let’s see who’s been interacting about the recent E4 summit:

Whilst it looks like Mark Rittman is the centre of everything, this is actually highlighting a skew in the source dataset – which includes everything Mark tweets but not all tweets about E4. See the section at the end of this article for more details of the dataset.

The lower cluster is Mark as the addressee of tweets (i.e. he is the in_reply_to_screen_name), whilst the upper cluster is tweets that Mark has sent addressing others (i.e. he is the user.screen_name).

If we click on Add Links a couple of times we can see that there’s other connections here – for example, Mark replies to Stewart (@stewartbryson), who Christian Berg (@Nephentur) talks to, who in turn talks to Mark.

This being twitter and the age of narcissism, I’ll click on my vertex and click Expand Selection to see the people who in turn talk to me:

And by using Add Link see how they relate to those already shown in the Graph:

Viewing Associated Records

Within Graph there’s the option to view the data associated with one or more vertices. We do this by selecting a vertex and clicking on View Example Docs (in Elasticsearch parlance, a document is akin to a ‘row’ as traditional RDBMS folk would know it). From here select the field – for twitter the text field has the contents of the tweet:

Adding Even more Vertex Fields

So, we’ve got a bit of a picture of who talks to whom, but can we see what they’re talking about? We could use the text field shown above to see the contents of tweets but that’s down in the weeds of individual tweets – we want to step back a notch and get a summarised view.

First I add in the hashtag field:

And then deselect the two username fields. This is so that I can expand existing vertices, and instead of showing related hashtags and users, instead I only expand it to show hashtags – and not additional users.

Now I select Mark as the orinator of a tweet, and Expand Selection followed by Add Links on all vertices until I get this:

The number of values selected is key in getting a representative Graph. Above I used a value of 10. Compare that to instead running the same process but with 50. Under Options I’ve left Significant Links enabled, and set Certainty to 1:

One interesting point we can see from this is that the user “itknowingness” in the cluster on the left seems to use all the hashtags, but doesn’t interact with anyone – from the Graph it’s easy to see, and a great example of where Graph gives you the answer to a question you didn’t necessarily know that you had, and which to get the answer out through a traditional RDBMS query would need a very specific query to do so. Looking at the source data via Kibana’s Discover panel shows that it is indeed a bot auto-retweeting anything and everything:

Building a Graph from Scratch

Now that we’ve seen all the salient functions, let’s start with a blank canvas, and see where we get.
The setttings I’m using are:

  • Significant Links unticked
  • Certainty = 1
  • Field entities.hashtags.text.analyzed max terms = 10
  • Field user.screen_name max terms = 10
  • Initial search term rmoff

Then I click on markrittman and Expand Selection, the same for mrainey, and also for the two hashtags e4 and hadoop:

Within the clusters, let’s see what links exist. With no vertices select I click on Add Links (which seems to be the same as selecting all vertices and doing the same). With each click additional links are added, all related to the hadoop/bigdata area:

I’m interested now in the E4 region of the Graph, and the vertices related to Mark Rittman. Clicking on his vertex and clicking “Select Neighbours” does exactly that:

Now I’m more interested in digging into the terms (hashtags) that are related that people, so I deselect the user.screen_name field, and then Expand Selection and Add Links again.

Note the width of the connections – a strong relationship between Mark Rittman, “Hadoop” and “SQL”, which is presumably from the tweets around the presentation he did recently on the subject of… SQL on Hadoop. Other terms, including Hive and Impala, are also related, as you’d expect.

Graphing Tweet Text Contents

By making sure that the tweet text is available as an analysed field we can produce a Graph based on the ‘tokens’ within the tweet, rather than the literal 140 characters. Whilst hashtags are there deliberately to help with the classification and grouping of tweets (so that other people can follow conversations on the same subject) there are two reasons why you’d want to look at the tweet text too:

  1. Not everyone uses hashtags
  2. Not all relationships are as boolean as a hashtag or not – maybe a general discussion in an area re-uses the same words which overall forms a relationship between the terms.

Here I’m going back to the default settings:

  • Significant Links ticked
  • Certainty = 3

And returning two fields – hashtag and tweet text

  • Field entities.hashtags.text.analyzed max terms = 20
  • Field text.analyzed max terms = 50
  • Initial search term kafka

I then tidy it up a bit :

  • Joining the same/near-same text and hashtags, such as “kafkasummit” hashtag and the same text. If you think about the contents of a tweet, hashtags are part of the text, therefore, there’s going to be a lot of this duplication.
  • Blacklisted text terms that are URL snippets. Here I’m using the Example Docs function to check the context of the term in the whole text field

    I also blacklisted common words (“the”, “of”, etc), and foreign ones (how British…).

Behind the Scenes

The Kibana Graph plugin is just a front-end for the Graph extension in Elasticsearch. It’s useful (and fun!) for exploring data, but in practice you’d be making direct REST API calls into Elasticsearch to retrieve a list of vertices and connections and relative weights for use in your application. You can see details of this from the Settings page and Last Request option

Looking at an example (the one used in the first example on this article), the request is pretty simple:

    "query": {
        "query_string": {
            "default_field": "_all",
            "query": "kibana"
    "controls": {
        "use_significance": true,
        "sample_size": 2000,
        "timeout": 5000
    "connections": {
        "vertices": [
                "field": "entities.hashtags.text.analyzed",
                "size": 5,
                "min_doc_count": 3
    "vertices": [
            "field": "entities.hashtags.text.analyzed",
            "size": 5,
            "min_doc_count": 3

and the response not too complex either, just long.

    "took": 201,
    "timed_out": false,
    "failures": [],
    "vertices": [
            "field": "entities.hashtags.text.analyzed",
            "term": "logstash",
            "weight": 0.1374238061561338,
            "depth": 0
            "field": "entities.hashtags.text.analyzed",
            "term": "timelion",
            "weight": 0.12719678206002483,
            "depth": 0
            "field": "entities.hashtags.text.analyzed",
            "term": "elasticsearch",
            "weight": 0.11733085557405047,
            "depth": 0
            "field": "entities.hashtags.text.analyzed",
            "term": "osdc",
            "weight": 0.00759026383038536,
            "depth": 1
            "field": "entities.hashtags.text.analyzed",
            "term": "letsencrypt",
            "weight": 0.006869972953128271,
            "depth": 1
            "field": "entities.hashtags.text.analyzed",
            "term": "kibana",
            "weight": 0.6699955212823048,
            "depth": 0
            "field": "entities.hashtags.text.analyzed",
            "term": "filebeat",
            "weight": 0.004700657388257993,
            "depth": 1
            "field": "entities.hashtags.text.analyzed",
            "term": "elk",
            "weight": 0.09717015256984456,
            "depth": 0
            "field": "entities.hashtags.text.analyzed",
            "term": "justsayin",
            "weight": 0.005724977460940227,
            "depth": 1
            "field": "entities.hashtags.text.analyzed",
            "term": "elasticsearch5",
            "weight": 0.004700657388257993,
            "depth": 1
    "connections": [
            "source": 0,
            "target": 3,
            "weight": 0.00759026383038536,
            "doc_count": 26
            "source": 7,
            "target": 5,
            "weight": 0.02004197094823259,
            "doc_count": 26
            "source": 5,
            "target": 4,
            "weight": 0.006869972953128271,
            "doc_count": 6
            "source": 5,
            "target": 0,
            "weight": 0.018289612748107368,
            "doc_count": 48
            "source": 0,
            "target": 6,
            "weight": 0.004700657388257993,
            "doc_count": 11
            "source": 7,
            "target": 0,
            "weight": 0.0038135609650491726,
            "doc_count": 10
            "source": 0,
            "target": 5,
            "weight": 0.0052711254217388415,
            "doc_count": 48
            "source": 0,
            "target": 9,
            "weight": 0.004700657388257993,
            "doc_count": 11
            "source": 5,
            "target": 1,
            "weight": 0.033204869273453314,
            "doc_count": 29
            "source": 1,
            "target": 5,
            "weight": 0.04492364819068228,
            "doc_count": 29
            "source": 5,
            "target": 8,
            "weight": 0.005724977460940227,
            "doc_count": 5
            "source": 2,
            "target": 5,
            "weight": 0.00015519515214322833,
            "doc_count": 80
            "source": 5,
            "target": 7,
            "weight": 0.022734810798933344,
            "doc_count": 26
            "source": 7,
            "target": 2,
            "weight": 0.0006823241440183544,
            "doc_count": 13

Note how the connections are described using the relative (zero-based) instance number of the vertices. You can also see that the width of a connection is based on the weight (calculated from the significant terms algorithm), rather than document count. Compare the connection width of timelion/kibana (vertices 1 and 5 respectively), with a weighting of 0.33 (kibana -> timelion) and 0.045 (timelion -> kibana) but overlapping document count of 29:

with elasticsearch -> kibana that has an overlapping document count of 80 but only a weight of 0.0001.

Elasticsearch’s documentation describes the significant terms algorithm thus, using the example of suggesting “H5N1” when users search for “bird flu” in text:

In all these cases the terms being selected are not simply the most popular terms in a set. They are the terms that have undergone a significant change in popularity measured between a foreground and background set. If the term “H5N1” only exists in 5 documents in a 10 million document index and yet is found in 4 of the 100 documents that make up a user’s search results that is significant and probably very relevant to their search. 5/10,000,000 vs 4/100 is a big swing in frequency.

So from this, we can roughly say that Graph is looking at the number of documents in which timelion is mentioned as a proportion of the whole dataset, and then in the number of documents in which the hashtag Kibana exists and also timelion is mentioned. Since the former is a plugin of the latter, the close relationship would be expected. You can use Kibana to explore the significant terms concept further – for example, taking the same ‘seed’ as the original Graph query above, Kibana, gives a similar set of results as the Graph:

More information about the scoring can be found here, which includes the fact that the scoring is, in part, based on TF-IDF (Term Frequency-Inverse Document Frequency).


Graph requires a licence – see here for details.


This tool is a great way to dip one’s toe into the waters of Graph analysis and visualisation. It’s another approach to consider in the data discovery phase of your analytics work, when you don’t even know the questions that you’ve got for the data in front of you. Your data can remain in Elasticsearch in the same format it’s always been, and the Graph function just runs on top of it.

I’ll not profess to be a Graph theory expert, so can’t pass much comment on the theoretical rigour of the results and techniques seen. One thing that struck me with it was that there’s no (apparent) way to manually influence the weight of connections and vertices – for example, based on the number of followers someone has one twitter consider them more (or less) relevant when determining relationships.

For a well-informed view on Graph theory and Social Network Analysis (SNA), see Jordan Meyer’s presentation here (and associated R code), as well as Mark Rittman’s presentation from BIWA this year.

Footnote: The Twitter Dataset

The dataset I’m using is a live stream from Twitter, via Logstash and Kafka, searching for a set of terms related to me and the field I work in. Therefore, there’s going to be a bunch of relationships missing (if I’ve not included the relevant term in my tweet search), and relationships over-stated (because as a proportion of all the records the terms I’ve selected will dominate).
An interesting use of Graph (or Elasticsearch’s significant terms aggregation in general) could be to identify all the relevant terms that I should be including in my twitter search, by sampling an ‘unpolluted’ feed for relationships. For example, if I’m interested in capturing Kafka tweets, perhaps I should also be capturing those related to Samza, Spark, and so on.

The post Experiments with Elastic’s Graph Tool appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Contemplating Upgrading to OBIEE 12c?

Thu, 2016-04-28 05:00
Where You Are Now

OBIEE 12c has been out for some time, and it seems like most folks are delaying upgrading to OBIEE 12c until the very last minute. Or at least until Oracle decides to put out another major version change of OBIEE, which is understandable. You’ve already spent time and money and devoted hundreds of resource hours to system monitoring, maintenance, testing, and development. Maybe you’ve invested in staff training to try to maximize your ROI in your existing OBIEE purchase. And now, after all this time and effort, you and your team have finally gotten things just right. Your BI engine is humming along, user adoption and stickiness are up, and you don’t have a lot of dead objects clogging up the Web Catalog. Your report hacks and work-arounds have been worked and reworked to become sustainable and maintainable business solutions. Everyone is getting what they want.

Sure, this scenario is part fantasy, but it doesn’t mean that as a BI team lead or member, you’re not always working toward this end. It would be nice to think that the people designing the tools with which we do this work understood the daily challenges and processes we must undergo in order to maintain the precarious homeostasis of our BI ecosystems. That’s where Rittman Mead comes in. If you’re considering upgrading to OBIEE 12c, or are even curious, keep reading. We’re here to help.

So Why Upgrade

Let’s get right down to it. Shoot over here and here to check out what our very own Mark Rittman had to say about the good, the bad, and the ugly of 12c. Our Silvia Rauton did a piece on lots of the nuts and bolts of 12c’s new front-end features. They’re all worth a read. Upgrading to OBIEE 12c offers many exciting new features that shouldn’t be ignored.

Heat Map

How Rittman Mead Can Help

We understand what it is to be presented with so many project challenges. Do you really want to risk the potential perils and pitfalls presented by upgrading to OBIEE 12c? We work both harder and smarter to make this stuff look good. And we get the most out of strategy and delivery via a number of in-house tools designed to keep your OBIEE deployment in tip top shape.

Maybe you want to make sure all your Catalog and RPD content gets ported over without issue? Instead of spending hours on testing every dashboard, report, and other catalog content post-migration, we’ve got the Automated Regression Testing package in our tool belt. We deploy this series of proprietary scripts and dashboards to ensure that everything will work just the way it was, if not better, from one version to the next.

Maybe you’d like to make sure your system will fire on all cylinders or you’d like to proactively monitor your OBIEE implementation. For that we’ve got the Performance Analytics Dashboards, built on the open source ELK stack to give you live, active monitoring of critical BI system stats and the underlying database and OS.

OBIEE Performance Analytics

On top of these tools, we’ve got the strategies and processes in place to not only guarantee the success of your upgrade, but to ensure that you and your team remain active and involved in the process.

What to Expect

You might be wondering what kinds of issues you can expect to experience during upgrading to OBIEE 12c (which is to say, nothing’s going to break, right?). Are you going to have to go through a big training curve? Does upgrading to OBIEE 12c mean you’re going to experience considerable resource downtime as your team, or an even an outside company, manages this process? To answer this question, I’m reminded of a quote from the movie Fight Club: “Choose your level of involvement.”

While we always prefer to work alongside your BI or IT team to facilitate the upgrade process, we also know that resource time is valuable and that your crew can’t stop what they’re doing until things wraps up. We often find that the more clients are engaged with the process, however, the easier the hand-off is because clients better understand best practices, and IT and BI teams are more empowered for the future.

Learning More about OBIEE 12c

But if you’re like many organizations, maybe you have to stay more hands off and get training after the upgrade is complete. Check out the link here to look over the agenda of our OBIEE 12c Bootcamp training course. Like our hugely popular 11g course, this program is five days of back-to-front instruction taught via a selection of seminars and hands-on labs, designed to impart most everything your team will need to know to continue or begin their successful BI practice.

What we often find is that, in addition to being a thorough and informative course, the Bootcamp is a great way to bring together teams or team members, often dispersed among different offices, under one roof to gain common understanding about how each person plays an important role as a member of the BI process. Whether they handle the ETL, data modeling, or report development, everyone can benefit from what often evolves from a training session into some impromptu team building.

Feel Empowered

If you’re still on the fence about whether or not to upgrade, as I said before, you’re not alone. There are lots of things you need to consider, and rightfully so. You might be thinking, “What does this mean for extra work on the plates of my resources? How can I ensure the success of my project? Is it worth it to do it now, or should I wait for the next release?” Whatever you may be mulling over, we’ve been there, know how to answer the questions, and have some neat tools in our utility belt to move the process along. In the end, I hope to have presented you with some bits to aid you in making a decision about upgrading to OBIEE 12c, or at least the impetus to start thinking about it.

If you’d like any more information or just want to talk more about the ins and outs of what an upgrade might entail, send over an email or give us a call.

The post Contemplating Upgrading to OBIEE 12c? appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Data Integration Tips: Oracle Data Integrator 12c Password

Mon, 2016-04-25 00:12

Hey, everyone. It’s Sunday night, and we have just enough time for another Data Integration Tip from Rittman Mead. This one has originated from many years of Oracle Data Integrator experience—and a lost 12c password. Let me start first by stating there is never any blame placed when a password is lost, forgotten, or just never stored in a safe place. It happens more often than you might think! Unfortunately, there is no “Forgot password?” link in ODI 12c, which is why I wanted to share my approach to password recovery for these situations.


The Challenge: Lost 12c Password

There are typically two passwords used in Oracle Data Integrator 12c that are forgotten and difficult to recover:

  1. The Work Repository password, created during the setup of the ODI repositories.
  2. The SUPERVISOR user password.

Often there will be more than one ODI user with supervisor privileges, allowing the SUPERVISOR user account password to be reset and making everyone’s life a bit easier. With that, I’ll focus on the Work Repository password and a specific use case I ran into just recently. This approach will work for both lost 12c password instances and I have used it for each in the past.


Now, yes, there is a feature that allows us to change the Work Repository password from within ODI Studio. But (assuming you do have the ability to edit the Work Repository object) as you can see in the image, you also need to know the “current password.” Therein lies the problem.

The Scenario

OK, here we go. The situation I ran into was related to an ODI 11g to 12c upgrade. During the upgrade, we cloned the master and work repositories and set them up on a new database instance in order to lessen the impact on the current 11g repositories. To make this work, a few modifications are required after cloning and before the ODI upgrade assistant can be run. Find more details on these steps in Brian Sauer’s post, Upgrade to ODI 12c: Repository and Standalone Agent.

  • Modify the Work repository connection from within the Master repository. The cloned Master repository is still pointed to the original ODI 11g Work Repository and the connection must be updated.
  • Update the SYSTEM.SCHEMA_VERSION_REGISTRY$ table to add an entry for the cloned ODI repository in the new database instance.
  • Detach the Work Repository from the original Master Repository.

Easy enough. The upgrade assistant completed successfully and everything was working great during testing, until we attempted to open the Work Repository object in ODI:

“Work repository is already attached to another master repository”

Uh-oh. It seems the last bullet point above was skipped. No worries. We have a simple solution to this problem. We can detach the Work Repository from the Master, then attach it once again. Interestingly enough, the action of detaching the repository cleans up the metadata and allows the Work Repository to be added to the cloned master with no problem.

Detaching is easy. Just confirm that you want to remove the Work Repository and poof, it’s gone. It’s the reattaching where we run into an issue…our lost 12c password issue (you knew I was going to bring that up, didn’t you?). Adding a Work repository requires a JDBC connection to a new or existing repository. In this case, we choose the existing repository in our cloned database. The same one we just detached from the Master. Just make sure that you choose to keep the repository contents or you’ll have a much bigger challenge ahead of you.

But then, out of nowhere, we’re prompted for the Work Repository password.


Hmm…well, we set the ODI 11g repository up in 2011. Jim, who installed it for us, doesn’t work here any longer. “Hmm” is right!

Here’s the Tip

Before we go any further, full disclosure—this is most likely not considered a supported action in the eyes of Oracle, so beware. Warning SignAlso, I haven’t attempted to use the ODI SDK and a Groovy script to update a password, so that might be the way to go if you’re concerned about this being a hack. But desperate times require desperate measures, as they say.

In order to “recover” a password for the Work Repository, we must actually change it behind the scenes in the repository tables. There’s a great deal of metadata we can access via the repository schema, and the modification of this data via the schema is not typical nor recommended, but sometimes necessary.

Oracle Support has a Knowledge Base document, Oracle Data Integrator 11g and 12c Repository Description (Doc ID 1903225.1), which provides a nice data dictionary for the repositories. Looking at the ODI 12.2.1 version of the repository definition, we find that the table SNP_LOC_REPW in the Work Repository stores the value for the repository password in the column REP_PASSW. Now the password must be encoded to match the repository and environment, so it cannot simply be added to the table in plain text.

Encoding a password is something that Oracle Data Integrator developers and admins have been doing for years, most often when setting up a Standalone agent. As a part of the agent installation, there is a script called encode.sh (or encode.bat for Windows) that will accept a plain text password as a parameter and output the encoded string. Brilliant! Let’s try it out.

Browse to the ODI agent domain home and drill into the bin directory. From there, we can execute the encode command. A quick look at the script shows us the expected input parameters.


The instance name is actually the Agent name. Ensure the agent is running and fire off the script:

[oracle@ODIGettingStarted bin]$ ./encode.sh -INSTANCE=OGG_ODI_AGENT
2016-04-24 22:00:50.791 TRACE JRFPlatformUtil:Unable to obtain JRF server platform. Probably because you are in JSE mode where oracle.jrf.ServerPlatformSupportFactory is not available which is expected.
2016-04-24 22:00:56.855 NOTIFICATION New data source: [OGG_ODI_REPO/*******@jdbc:oracle:thin:@//localhost:1521/ORCL]
2016-04-24 22:01:01.931 NOTIFICATION Created OdiInstance instance id=1
Enter password to encode:

Now you can enter a password to encode, hit return and boom! Here’s your encoded string.

Enter password to encode:

Let’s take the entire string and write a quick update statement for the Work Repository SNP_LOC_REPW table. Even though I know there is only one Work Repository, I still use a where clause to ensure I’m updating the correct row.

set REP_PASSW = 'ejjYhIeqYp4xBWNUooF31Q=='

Commit the transaction and Bob’s your uncle! Now we can continue on with adding the Work Repository through ODI Studio. Just enter the password used in the encode.sh command and you’re in!

As I mentioned earlier, this same approach can be used to update the SUPERVISOR user password, or really any ODI user password (if they are stored in the repository). In this case, the use of encode.sh is the same, but this time we update the SNP_USER table in the Master repository. The column PASS stores the encoded password for each user. Just remember to change the password everywhere that the user is set to access ODI (agents, etc).

So there you have it. A quick, simple way to “recover” a lost ODI 12c password. Just be sure that this information doesn’t fall into the wrong hands. Lock down your ODI agent file directory to only those administrators who require access. Same goes for the repository schemas. And finally, use this approach in only the most dire situation of a completely lost 12c password. Thanks for reading and look here if you want more DI Tips. Enjoy your week!

The post Data Integration Tips: Oracle Data Integrator 12c Password appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Rittman Mead at Collaborate 16: Data Integration Focus

Mon, 2016-04-04 04:59

It’s that time of year again when Oracle technologists from around the world gather in Las Vegas, Nevada, to teach, learn, and, of course, network with their peers. The Collaborate 16 conference, running for 10 years now, has been a collaboration, if you will, between the Independent Oracle Users Group (IOUG), Oracle Applications Users Group (OAUG), and Quest International Users Group (Quest), making it one of the largest user group conferences in the world. Rittman Mead will once again be in attendance, with two data integration focused presentations by me over the course of the week.

My first session at Collaborate 16, “A Walk Through the Kimball ETL Subsystems with Oracle Data Integration,” scheduled for Monday, April 11, at 10:30 a.m., will focus on how we can implement the ETL Subsystems using Oracle Data Integration solutions. As you know, Big Data integration has been the hot topic over the past few years, and it’s an excellent feature in the Oracle Data Integration product suite (Oracle Data Integrator, GoldenGate, & Enterprise Data Quality). But not all analytics require big data technologies, such as labor cost, revenue, or expense reporting. Ralph Kimball, dimensional modeling and data warehousing expert and founder of The Kimball Group, spent much of his career working to build an enterprise data warehouse methodology that can meet these reporting needs. His book, “The Data Warehouse ETL Toolkit,” is a guide for many ETL developers. This session will walk you through his ETL Subsystem categories: Extracting, Cleaning & Conforming, Delivering, and Managing, describing how the Oracle Data Integration products are perfectly suited for the Kimball approach.

I go into further detail on one of the ETL Subsystems in an upcoming IOUG Select Journal article, titled “Implement an Error Event Schema with Oracle Data Integrator.” The Select Journal is a technical magazine published quarterly and available exclusively to IOUG members. My recent post Data Integration Tips: ODI 12c Repository Query – Find the Mapping Target Table shows a bit of the detail behind the research performed for the article.


If you’re not familiar with the Kimball approach to data warehousing, I definitely would recommend reading one (or more) of their published books on the subject. I would also recommend attending one of their training courses, but unfortunately for the data warehousing community, the Kimball Group has closed shop as of December 2015. But hey, the good news is that two of the former Kimball team members have joined forces at Decision Works, and they offer the exact same training they used to deliver under The Kimball Group name.

GoldenGate to Kafka logo

On Thursday, April 14, at 11 a.m., I will dive into the recently released Oracle GoldenGate for Big Data 12.2 in a session titled “Oracle GoldenGate and Apache Kafka: A Deep Dive into Real-Time Data Streaming.” The challenge for us as data integration professionals is to combine relational data with other non-structured, high volume and rapidly changing datasets, known in the industry as Big Data, and transform it into something useful. Not just that, but we must also do it in near real-time and using a big data target system such as Hadoop. The topic of this session, real-time data streaming, provides us a great solution for that challenging task. By combining GoldenGate, Oracle’s premier data replication technology, and Apache Kafka, the latest open-source streaming and messaging system for big data, we can implement a fast, durable, and scalable solution.

If you plan to be at Collaborate 16 next week, feel free to drop me a line in the comments, via email at michael.rainey@rittmanmead.com, or on Twitter @mRainey. I’d love to meet up and have a discussion around my presentation topics, data integration, or really anything we’re doing at Rittman Mead. Hope to see you all there!

The post Rittman Mead at Collaborate 16: Data Integration Focus appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

ChitChat: The Importance of BI Integrations

Thu, 2016-03-31 05:00

A user’s workflow shouldn’t change to accommodate a new tool. A new tool should fill a gap in the current workflow and help streamline the user’s process. An application without a clearly defined scope eventually overlaps with existing solutions, creating confusion and distress among users. It takes both time and effort to clarify the appropriate situations to use the application, reconcile different use cases and approaches, and resolve incorrect uses. We designed ChitChat with appropriate scopes in mind, implementing key integrations, to fit seamlessly into existing workflows.

What exactly do we mean by “scope?”

Let’s look at an example with JIRA. JIRA owns the complete ticketing process, meaning tickets are stored and maintained by the tool. Using a competing ticket solution, such as Trello, for the same purpose within the organization will cause havoc among users. However, JIRA tickets are still extremely useful outside of the JIRA application. They can be linked to and displayed inside other applications, but they are still maintained by JIRA itself.

If you can recognize that the ticketing management should be handled solely by JIRA, but exposure of those tickets outside of the tool is also important, then you understand the correct scope of the application. The scope of the application does not determine where the context of an application is useful. It only describes what section of a workflow the application has absolute control over. The question isn’t “Where should we be able to view the information?” The question is “Where should the content be maintained?”

ChitChat respects the appropriate scopes of neighboring applications and allows the flexibility to continue maintaining the scopes of these applications. With integrations to Atlassian JIRA and Confluence and Salesforce Chatter, the information you need is available where you need it, without infringing on your existing workflow.

Examples of Integrations

Let’s look at some examples. As we use a BI dashboard, we stumble upon an issue. Using ChitChat, the issue can be identified and a conversation can be made about temporarily working around the problem. However, the IT team uses JIRA to accept issues and resolves them as appropriate. We obviously want the IT team to know of this issue, so we must create a ticket in JIRA as well. Rather than going to JIRA and creating a ticket manually, we can simply export the initial annotation to JIRA. The workflow remains generally identical, but now requires less time and effort. And this comes with the added benefit of the ticket pointing directly to the location of the issue on the dashboard.

In another instance, let’s say our dashboard has some confusing calculations on it, some of which are not immediately recognizable. The formulas used, and the reasons to use such formulas, are available in Atlassian Confluence for us to view. However, not all users have a Confluence account, and even fewer have access to the document. We could copy and paste the calculations as a document using ChitChat, but now we have two separate instances of the same information. If the calculations are changed, we must ensure both locations are accurate. Alternatively, ChitChat can sync directly with Confluence and pull a page into the application. The page guarantees accuracy by consistently pulling new updates from Confluence, as well as pushing updates to Confluence if the content is changed in ChitChat.

These approaches allow the JIRA ticket and Confluence document to be maintained in the appropriate location, while also being available in a useful context. Chitchat does not impede on the purposes of other applications. ChitChat offers integrations that seamlessly enhance your workflow without making it convoluted. Our tool is designed specifically to fill the missing pieces in your BI workflow, allowing for a seamless transition between analysis and communication.

To learn more about ChitChat’s many commentary features, or to request a demo, click here.

The post ChitChat: The Importance of BI Integrations appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

New OTN Article – OBIEE Performance Analytics: Analysing the Impact of Suboptimal Design

Wed, 2016-03-30 03:09

I’m pleased to have recently had my first article published on the Oracle Technology Network (OTN). You can read it in its full splendour and glory(!) over there, but I thought I’d give a bit of background to it and the tools demonstrated within.

OBIEE Performance Analytics Dashboards

One of the things that we frequently help our clients with is reviewing and optimising the performance of their OBIEE systems. As part of this we’ve built up a wealth of experience in the kind of suboptimal design patterns that can cause performance issues, as well as how to go about identifying them empirically. Getting a full stack view on OBIEE performance behaviour is key to demonstrating where an issue lies, prior to being able to resolve it and proving it fixed, and for this we use the Rittman Mead OBIEE Performance Analytics Dashboards.

OBIEE Performance Analytics

A common performance issue that we see is analyses and/or RPDs built in such a way that the BI Server inadvertently returns many gigabytes of data from the database and in doing so often has to dump out to disk whilst processing it. This can create large NQS_tmp files, impacting the disk space available (sometimes critically), and the disk I/O subsystem. This is the basis of the OTN article that I wrote, and you can read the full article on OTN to find out more about how this can be a problem and how to go about resolving it.

OBIEE implementations that cause heavy use of temporary files on disk by the BI Server can result in performance problems. Until recently in OBIEE, it was really difficult to track because of the transitory nature of the files. By the time the problem had been observed (for example, disk full messages), the query responsible had moved on and so the temporary files deleted. At Rittman Mead we have developed lightweight diagnostic tools that collect, amongst other things, the amount of temporary disk space used by each of the OBIEE components.


This can then be displayed as part of our Performance Analytics Dashboards, and analysed alongside other performance data on the system such as which queries were running, disk I/O rates, and more:

OBIEE Temp Disk Usage

Because the Performance Analytics Dashboards are built in a modular fashion, it is easy to customise them to suit specific analysis requirements. In this next example you can see performance data from Oracle being analysed by OBIEE dashboard page in order to identify the cause of poorly-performing reports:

OBIEE Database Performance Analysis

We’ve put online a set of videos here demonstrating the Performance Analytics Dashboards, and explaining in each case how they can help you quickly and accurately diagnose OBIEE performance problems.

You can read more about our Performance Analytics offering here, or get in touch to find out more!

The post New OTN Article – OBIEE Performance Analytics: Analysing the Impact of Suboptimal Design appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

The Importance of BI Documentation

Thu, 2016-03-17 05:00
Why Is BI Documentation Important?

Business intelligence systems come with a lot of extra information. Even beautifully constructed analyses have piles of background information and histories. Administrators might often have memos and updates that they’d like share with analysts. Sales figures might have anomalies that need further explanation. But OBIEE does not currently have any options for BI Documentation inside the dashboard.

Let’s say a BI user for a cell phone distribution company is viewing a report comparing the yearly sales figures for several different cell phones. If the analyst notices that one specific cell phone is outperforming the others, but doesn’t know what makes that specific model unique, then they have to go searching for that information.

But what if the individual phone model specifications and advertising and marketing histories were already included as reports inside the dashboard? What if the analyst, with only a couple of clicks, discovered that the reason one cell phone was outperforming the others was due to its next-gen screen, camera, and chip upgrades, which proved popular with consumers? Or what if the analyst discovered that the popular phone, while containing outdated peripherals, was selling so well because a Q3 advertising push for that model only? All of this information might not be contained in the dashboard’s visuals, but greatly affects the analysts’ understanding of the reports.

Current Options for OBIEE Documentation

Some information can be displayed as visuals, but many times this isn’t a practical solution. Besides making dashboards too cluttered, memos, product descriptions, company directories, etc., are not practical as charts and graphs. As of right now, important documentation can be stored in a wide range of places outside of the BI dashboard, but the operating reality at most organizations means that important information is spread across several locations and not always accessible to the people who need it.

Workarounds are inefficient, cost time, cause BI users to leave the BI environment (potentially reducing usage), and increase frustration. If an analyst has to email several different people to locate the information she wants, that complicates her workflow and produces extraneous communications (who likes answering emails?). Before now, there wasn’t an easy solution to these problems.

ChitChat’s BI Documentation Features

With ChitChat, it’s now possible to store critical documentation where it belongs—at the source of the conversation. Keep phone directories, memos from administrators (or requests from analysts to administrators), product descriptions, analytical histories—really, the possibilities are endless—inside the dashboard where they are accessible to the people who need them. Shorten workflows and make life easier for your BI users.

ChitChat’s easy-to-use functionality allows BI users to copy and paste or write (ChitChat has a built-in WYSIWYG text editor) important information inside the BI dashboard, creating a quicker path to insightful and actionable analytics. And isn’t that the goal in the end?

To learn more about ChitChat’s many commentary features, or to request a demo, click here.

The post The Importance of BI Documentation appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

ASO Slice Clears – How Many Members?

Mon, 2016-03-14 05:00

Essbase developers have had the ability to (comparatively) easily clear portions of our ASO cubes since version 11.1.1, getting away from fiddly methods involving manually contra-ing existing data via reports and rules files, making incremental loads substantially easier.

Along with the official documentation in the TechRef and DBAG, there are a number of excellent posts already out there that explain this process and how to effect “slice clears” in detail (here and here are just two I’ve come across that I think are clear and helpful). However, I had a requirement recently where the incremental load was a bit more complex than this. I am sure people must have fulfilled in the same or a very similar way, but I could not find any documentation or articles relating to it, so I thought it might be worth recording.

For the most part, the requirements I’ve had in this area have been relatively straightforward—(mostly) financial systems where the volatile/incremental slice is typically a months-worth (or quarters-worth) of data. The load script will follow this sort of sequence:

  • [prepare source data, if required]
  • Perform a logical clear
  • Load data to buffer(s)
  • Load buffer(s) to new database slice(s)
  • [Merge slices]

With the last stage being run here if processing time allows (this operation precludes access to the cube) or in a separate routine “out of hours” if not.

The “logical clear” element of the script will comprise a line like (note: the lack of a “clear mode” argument means a logical clear; only a physical clear needs to be specified explicitly):

alter database ‘Appname‘.’DBName‘ clear data in region ‘{[Jan16]}’

or more probably

alter database ‘Appname‘.’DBName‘ clear data in region ‘{[&CurrMonth]}’

i.e., using a variable to get away from actually hard coding the member values to clear. For separate year/period dimensions, the slice would need to be referenced with a CrossJoin:

alter database ‘Appname‘.’DBName‘ clear data in region ‘Crossjoin({[Jan]},{[FY16]})’ alter database ‘${Appname}’.’${DBName}’ clear data in region ‘Crossjoin({[&{CurrMonth]},{[&CurrYear]})’

which would, of course, fully nullify all data in that slice prior to the load. Most load scripts will already be formatted so that variables would be used to represent the current period that will potentially be used to scope the source data (or in a BSO context, provide a FIX for post-load calculations), so using the same to control the clear is an easy addition.

Taking this forward a step, I’ve had other systems whereby the load could comprise any number of (monthly) periods from the current year. A little bit more fiddly, but achievable: as part of the prepare source data stage above, it is relatively straightforward to run a select distinct period query on the source data, spool the results to a file, and then use this file to construct that portion of the clear command (or, for a relatively small number, prepare a sequence of clear commands).

The requirement I had recently falls into the latter category in that the volatile dimension (where “Period” would be the volatile dimension in the examples above) was a “product” dimension of sorts, and contained a lot of changed values each load. Several thousand, in fact. Far too many to loop around and build a single command, and far too many to run as individual commands—whilst on test, the “clears” themselves ran satisfyingly quickly, it obviously generated an undesirably large number of slices.

So the problem was this: how to identify and clear data associated with several thousand members of a volatile dimension, the values of which could change totally from load to load.

In short, the answer I arrived at is with a UDA.

The TechRef does not explicitly say or give examples, but because the Uda function can be used within a CrossJoin reference, it can be used to effect a clear: assume the Product dimension had an UDA of CLEAR against certain members…

alter database ‘Appname‘.’DBName‘ clear data in region ‘CrossJoin({Uda([Product], “CLEAR”)})’

…would then clear all data for all of those members. If data for, say, just the ACTUAL scenario is to be cleared, this can be added to the CrossJoin:

alter database ‘Appname‘.’DBName‘ clear data in region ‘CrossJoin({Uda([Product], “CLEAR”)}, {[ACTUAL]})’

But we first need to set this UDA in order to take advantage of it. In the load script steps above, the first step is prepare source data, if required. At this point, a SQLplus call was inserted to a new procedure that

  1. examines the source load table for distinct occurrences of the “volatile” dimension
  2. populates a table (after initially truncating it) with a list of these members (and parents), and a third column containing the text “CLEAR”:


A “rules” file then needs to be built to load the attribute. Because the outline has already been maintained, this is simply a case of loading the UDA itself:


In the “Essbase Client” portion of the load script, prior to running the “clear” command, the temporary UDA table needs to be loaded using the rules file to populate the UDA for those members of the volatile dimension to be cleared:

import database ‘AppName‘.’DBName‘ dimensions connect as ‘SQLUsername‘ identified by ‘SQLPassword‘ using server rules_file ‘PrSetUDA’ on error write to ‘LogPath/ASOCurrDataLoad_SetAttr.err’;



With the relevant slices cleared, the load can proceed as normal.

After the actual data load has run, the UDA settings need to be cleared. Note that the prepared table above also contains an empty column, UDACLEAR. A second rules file, PrClrUDA, was prepared that loads this (4th) column as the UDA value—loading a blank value to a UDA has the same effect as clearing it.

The broad steps of the load script therefore become these:

  • [prepare source data, if required]
  • ascertain members of volatile dimension to clear from load source
  • update table containing current load members / CLEAR attribute
  • Load CLEAR attribute table
  • Perform a logical clear
  • Load data to buffers
  • Load buffer(s) to new database slice(s)
  • [Merge slices]
  • Remove CLEAR attributes

So not without limitations—if the data was volatile over two dimensions (e.g., Product A for Period 1, Product B for Period 2, etc.) the approach would not work (at least, not exactly as described, although in this instance you could possible iterate around the smaller Period dimension)—but overall, I think it’s a reasonable and flexible solution.

Clear / Load Order

While not strictly part of this solution, another little wrinkle to bear in mind here is the resource taken up by the logical clear. When initializing the buffer prior to loading data into it, you have the ability to determine how much of the total available resource is used for that particular buffer—from a total of 1.0, you can allocate (e.g.) 0.25 to each of 4 buffers that can then be used for a parallel load operation, each loaded buffer subsequently writing to a new database slice. Importing a loaded buffer to the database then clears the “share” of the utilization afforded to that buffer.

Although not a “buffer initialization” activity per se, a (slice-generating) logical clear seems to occupy all of this resource—if you have any uncommitted buffers created, even with the lowest possible resource utilization of 0.01 assigned, the logical clear will fail:


The Essbase Technical Reference states at “Loading Data Using Buffers“:

While the data load buffer exists in memory, you cannot build aggregations or merge slices, as these operations are resource-intensive.

It could perhaps be argued that as we are creating a “clear slice,” not merging slices (nor building an aggregation), that the logical clear falls outside of this definition, but a similar restriction certainly appears to apply here too.

This is significant as, arguably, the ideally optimum incremental load would be along the lines of

  • Initialize buffer(s)
  • Load buffer(s) with data
  • Effect partial logical clear (to new database slice)
  • Load buffers to new database slices
  • Merge slices into database

As this would both minimize the time that the cube was inaccessible (during the merge), and also not present the cube with zeroes in the current load area. However, as noted above, this does not seem to be possible—there does not seem to be a way to change the resource usage (RNUM) of the “clear,” meaning that this sequence has to be followed:

  • Effect partial logical clear (to new database slice)
  • Initialize buffer(s)
  • Load buffer(s) with data
  • Load buffers to new database slices
  • Merge slices into database

I.e., the ‘clear’ has to be fully effected before the initialization of the buffers. This works as you would expect, but there is a brief period—after the completion of the “clear” but before the load buffer(s) have been committed to new slices—where the cube is accessible and the load slice will show as “0” in the cube.

The post ASO Slice Clears – How Many Members? appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Use OBIEE to Achieve Your GOOOALS!!! – A Presentation for GaOUG

Thu, 2016-03-10 04:00


A few months before the start of the 2014 World Cup, Jon Mead, Rittman Mead’s CEO, asked me to come up with a way to showcase our strengths and skills while leveraging the excitement generated by the World Cup. With this in mind, my colleague Pete Tamisin and I decided to create our own game-tracking page for World Cup matches, similar to the ones you see on popular sports websites like ESPN and CBSSports, with one caveat: we would build the game-tracker inside an OBIEE dashboard.

Unfortunately, after several long nights and weekends, we weren’t able to come up with something we were satisfied with, but we learned tons along the way and kept a lot of the content we created for future use. That future use came several months later when we decided to create our own soccer match (“The Rittman Mead Cup”) and build a game-tracking dashboard that would support this match. We then had the pleasure to present our work in a few industry conferences, like the BI Forum in Atlanta and KScope in Hollywood, Florida.

GaOUG Tech Day

Recently I had the privilege of delivering that presentation one last time, at Georgia Oracle Users Group’s Tech Day 2016. With the right amount of silliness (yes, The Rittman Mead cup was played/acted by our own employees), this presentation allowed us to discuss with the audience our approach to designing a “sticky” application; meaning, an application that users and consumers will not only find useful, but also enjoyable, increasing the chances they will return to and use the application.

We live in an era where nice, fun, pretty applications are commonplace, and our audience expects the same from their business applications. Validating the numbers on the dashboard is no longer enough. We need to be able to present that data in an attractive, intuitive, and captivating way. So, throughout the presentation, I discussed with the audience the thoughtful approach we used when designing our game-tracking page. We focused mainly on the following topics: Serving Our Consumers; Making Life Easier for Our Designers, Modelers, and Analysts; and Promoting Process and Collaboration (the latter can be accomplished with our ChitChat application). Our job would have been a lot easier if ChitChat were available when we first put this presentation together….

Finally, you can find the slides for the presentation here. Please add your comments and questions below. There are usually multiple ways of accomplishing the same thing, so I’d be grateful to hear how you guys are creating “stickiness” with your users in your organizations.

Until the next time.

The post Use OBIEE to Achieve Your GOOOALS!!! – A Presentation for GaOUG appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

The Importance of BI Commentary

Mon, 2016-03-07 04:00
Why Is Commentary Important?

We communicate every day. Communication through text is especially abundant with the proliferation of new on-demand technologies. Have you gone through your emails today? Have you read the news, weather, or blogs (like this one)? Communication is the backbone to every interpersonal interaction. Without it, we are left guessing and assuming.

BI implementations are no exception when it comes to communication’s importance, and I would argue communication is a major component of every BI environment. The goal of any BI application is to discover and expose actionable information from data, but without collaboration, discovering insights becomes difficult. By allowing users to collaborate immediately in the BI application, new insights can be discovered quicker.

Any BI conversation should maintain its own dedicated communication channel, and the optimal place for these conversations is as close to the information-consumption phase as possible. By allowing users to collaborate in discussions over results at the same location as the data, users will be empowered to extract as much information as possible.

Unfortunately, commentary support is absent from OBIEE.

The Current OBIEE Communication Model

The lack of commentary support does not stop the community from developing their own methods or approaches to communicating within their BI environments. Right now, common approaches include purchasing pre-developed software, engineering custom solutions, or forcing the conversations into other channels.

Purchasing a commentary application or developing your own internal solutions expedites the user communication process. However, what about those who do not find a solution, and instead decide to use a “work-around” approach?

Choosing to ignore the missing functionality is the cheapest approach, initially, but may actually cost more in the long run. To engage in simple conversations, users are required to leave the BI dashboard, which adds time and difficulty to their daily processes. And reiterating the context of a conversation is both time consuming and error prone.

Additionally, which communication channel will the BI conversations invade? A dedicated communication channel, built specifically to easily display and relay the BI topics of interest, is the most efficient, and beneficial, solution.

How ChitChat Can Help

ChitChat provides a channel of communication directly within the BI environment, allowing users to engage in conversations as close to the data consumption phase as possible. Users will never be required to leave the BI application to engage in a conversation about the data, and they won’t need to reiterate the environment through screenshots or descriptions.

Recognizing the importance of separate channels of communication, ChitChat also easily allows each channel to maintain their respective scopes. For instance, a user may discover an error on a BI dashboard. Rather than simply identifying the error in the BI environment, the user can export the comment to Atlassian JIRA and create a ticket for the issue to be resolved, thus maintaining the appropriate scopes of both JIRA and ChitChat. Integrations allow existing channels of communication to maintain their respective importance, and appropriately restrict the scope of conversations.

ChitChat is placed in the most opportune location for BI commentary, while maintaining the correct scope of the conversation. Other approaches often ignore one of these two aspects of BI commentary, but both are required to efficiently support a community within a BI environment. The most effective solution is not one that simply solves the problem, or meets some of the criteria, but the solution that meets all of the requirements.

Commentary Made Simple

Conversation around a BI environment will always occur, regardless of the supporting infrastructure or difficulty in doing so. Rather than forcing users to spend time working around common obstacles or developing their own solutions, investing in an embedded application will save both time and money. These offerings will not only meet the basic requirements, but also ensure the best experience for users, and the most return on investment.

Providing users the exact features they need, where they need it, is one step in nurturing a healthy BI environment, and ChitChat is an excellent solution to meet these criteria.

To find out more about ChitChat, or to request a demo, click here!

The post The Importance of BI Commentary appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

OBIEE 12c – Your Answers After Upgrading

Thu, 2016-03-03 04:00

Several blogs have already been written about new functionality in OBIEE 12c. Mark Rittman, for example, posted a good one here.

Now, I’ve personally had the chance to play with it for a few weeks, mostly in Answers and some with the RPD, and wanted to share my experience. With a sleek interface and many new functionalities, 12c brings some very useful features that users will appreciate. As with most new software releases, I expected to find issues that needed to be worked out. In general, I was pleasantly surprised with the UI, the speed, and the intuitiveness that came along with OBIEE 12c.

Here, I’ll share with you some of the new features within Answers:

Percent Calculation

If you’ve created lots of percent variance columns, it’s probably second nature that you will create your formula and then multiply by 100. In 12c, you can create your percent calculation without multiplying it by 100, then set your % data formatting in the Column Properties. In the same spot where you specify how the data is displayed, you can check the x100 box, which in turn will automatically multiply your results from that column by 100. Pretty sleek solution to simplify your formulas.

Percent Calc

Saved Columns

This feature is very well described here, so I will give a high level overview: 12c gives you a very easy way to save a complex formula into the catalog. If you’ve built a lot of logic in a column’s formula, and would like to reuse the logic in future reports, you will appreciate the opportunity of saving columns. I remember creating many financial calculations that had to be reused often, and until now there was no easy way to retrieve the column formulas. Trying to simplify my life, I ended up inventing “my own method” of saving complex calculations by saving different analyses that I named as “Master – Calculation” containing the columns that I reused often. I would start many reports based on these Master reports because they had my pre-built formulas; however, this was not a clean method for others to follow. OBIEE 12c gives you this clean and simple method for storing and reusing your most wanted columns. You do this by entering your formula in edit formula and choosing to “Save Column as” for future use.

Calculated Columns

OBIEE 12c provides a more intuitive way to create calculated columns than previous versions. In 10g or 11g, you needed to add a “whatever” column to the query, and then go in Edit Formula to define the calculation for your new column. While this worked, most new users often wondered why they were “bringing in two revenue columns,” for example. In 12c, you can add only the needed columns to your Criteria, then go straight to Results. In the Results tab, there is a New Calculated Measure icon that brings you immediately to the Edit Formula screen where you can name your new measure and define its formula.

Calc Columns

Measure Abbreviation

There is also a more intuitive abbreviation of the measures that are placed on a graph. In 11g, when you dragged an amount to an axis, you may recall that the numbers would show up exactly as the raw number. So, if your result was 12,000,000, then that was exactly what you would see on the graph to begin. If you wanted to improve your graph, then you needed to go to the Graph Properties and format the data from the axis to be abbreviated into, for our example above, millions (or 12M). To save you a step, 12c will automatically abbreviate your graph data in the most user-friendly way. So, if the data is 12,000,000, you automatically get 12M!

Measure Abbreviation.png

Heat Matrix

Easy to use heat matrix!—I mean it: easy. While in 11g, you would have to be somewhat visually savvy and spend a lot of time conditionally formatting. OBIEE 12c gives you a tool that allows you to create a meaningful heat matrix in a matter of minutes—wait—even seconds. All you need is to know the two dimensions and one measure that you would like to use, and drag and drop them. Choose from an array of color schemas and how you would like to use the colors. In no time, your heat matrix is ready.

Heat Map


A new member of the OBIEE family is here to provide a visual solution for very complex activities. The Treemap provides a hierarchical structure that allows you to quickly spot patterns and outliers. At first, it may require a bit of head twisting to look at a graph like this, but remember, this is indeed a graph for complex activities. One of the most ideal usages for this new feature is the grouping by parent/children groups and the displaying of how two measures fair up inside each group.


Advanced Analytics

OBIEE 12c gives you the capability of working with statistical and R functions right from the ‘Edit formula’ pane. While I have found that this new feature was still not very user friendly, it’s a lot easier than making this functionality work in 11g. For example, to create a simple Trendline with 11g, the developer had to slowly build each step of a calculation to find the slope of a line, and then find the Y intercept. With these answers in hand, the results had to be carefully placed on a graph, so that it could render meaningful results. If you require statistical graphs within OBIEE, 12c may be a great fit for you. For example, below is a graph showing four different Trendlines:


The Criteria for building these four lines would be very intense in 11g; but in OBIEE 12c, it contained only five columns: one for the Calendar Year, and one for each Trendline. The Trendlines were created one at a time, by inserting the new “Analytics” Function in the column’s formula (see below).

AA Combo

Data Mashup

This is a dream come true to many of us, though it requires an optional data visualization license. With this new functionality, you are able to use OBIEE along with any excel spreadsheet (XSA) saved on your machine.

You can add a spreadsheet to OBIEE from two areas:

  1. When you are creating an analysis (in the Criteria tab, and then choosing to add data source as shown below), or

Add Data Source

2. By going to the Visual Analyzer Home Page.

As this blog focuses on Answers, I will review the first option here.

There are three possible ways of analyzing a spreadsheet in Answers. You either want to:

  1. Analyze the spreadsheet by itself, or
  2. Use attributes from the spreadsheet along with fact data from your enterprise system, or
  3. Use fact data from your spreadsheet along with attributes and facts from your enterprise system.

For options 2 and 3 to work properly, it is important that your joins are properly matched (watch your cardinalities!) from your spreadsheet to your enterprise data. Also, as usual, option 3 will only work along with another fact table when the two tables are joined to a conformed dimension. Cardinalities and conformed dimensions are items that we generally take for granted when working on front-end OBIEE, because these points have been carefully handled during RPD modeling. Since the spreadsheet modeling has to be done in the front end, special caution must be used when modeling them in order to avoid “exploded” results, or simply inaccurate results.

Word of caution on placing an XSA sourced analysis in a shared folder:

Once you create an analysis using a spreadsheet and save it to a shared folder, you will receive this message:


Once you choose “YES,” the spreadsheet will show as a new subject area—for you and for anyone who has access to the folder in which you placed the analysis, meaning that the catalog security just TOOK CONTROL of your spreadsheet! Below is a screenshot of how they show as new subject areas:

Subject Areas

So, if your intent was to share an analysis from a XSA, but not necessarily share the entire spreadsheet to be reused, you may want to restrict your analysis to a folder with the specific securities that you would like to apply to your spreadsheet. BUT…think carefully before saving the analysis in a shared folder. If you realize that you made a mistake, just know that deleting your analysis from the incorrect folder will NOT remove your spreadsheet as an available subject area for other users. Remember, the catalog security took control of your spreadsheet, and it’s not going to let it go! If you saved the analysis in a folder with incorrect permissions, you must delete the spreadsheet altogether from the tool, reload it, and then save the analysis in the correct folder (with the permissions that you want).

You will likely need in-depth information regarding mashup security once you are really working with it. Check out this Oracle doc for more info.

Word of caution when archiving an analysis containing a spreadsheet, or when moving that analysis between environments:

The username of the owner of the analysis gets embedded in the column formula, and so does the precise name that you gave your spreadsheet when you first loaded it. So, let’s say that you are transitioning environments and the new environment does not contain your spreadsheets. If someone else has an archived catalog containing one of your mashup queries, they will get an error when retrieving results for your query, because the tool doesn’t have your spreadsheet loaded yet. The only way for them to unarchive your analysis and retrieve results is for YOUR USER to log into OBIEE, load the original spreadsheet (saving it with the same exact name as before), and then saving the analysis in the proper shared folder once again.

Deleting the New Subject Area

One tricky thing in this new tool: even if you uploaded your spreadsheet (XSA) during an analysis in OBIEE, it can only be deleted from the “New Home Page,” which is the Home Page of Visual Analyzer. You can get to the “New Home Page” from the “Old Home Page”:

New Home Page

Once in the New Home Page, click on Data Sources. Choose your Data Source and delete it!

Delete SA

I confess that I had some trouble finding the delete button. Maybe I would have bumped into it had I played more with VA, but that was not the case. Regardless, I felt relieved that this button existed somewhere!

Data Mashup Performance

This was a bit of an issue, but mostly when combined with the Advanced Analytics functions. From my research and from talking to colleagues, I found that the following must be observed to optimize performance:

  1. Reduce the size of your spreadsheet, when possible.
  2. DB indexing on the field that you are joining.
  3. Proper cardinality on your mashup joins with your DB data.
  4. Set up caching for mashups on bi server.

Overall, the experience in OBIEE 12c Answers was very positive, and the new features could bring a great deal of time savings for any organization!

To learn more about all that OBIEE 12c has to offer, check out our upcoming bootcamps here.

Hope to see you then!

The post OBIEE 12c – Your Answers After Upgrading appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing