Don’t Regret – Redact

July 6, 2016

By: Jon Chartrand – Director of Product Management

The concept of sensitive information management is germane to pretty much every business, organization, and public sector outfit in the world. Typically, this sensitive information is classified as “PII” or Personally Identifiable Information – this would be any data which could lead to someone being personally identified and includes things like social security numbers, date of birth, and phone numbers. Other data, often revolving around financial information, includes credit card numbers, bank account numbers, and account balances. All of these data points must be carefully monitored and masked before documents can potentially be made available for distribution – externally or internally. Failure to do so can lead to devastating legal and financial consequences, bankrupting corporations and governments alike. As experts in the field of content management and in bringing order to unstructured data, we felt an obligation to assist our clients with this often expensive and time-consuming effort.

Examples of PII, according to the National Institute of Standards and Technology (NIST) [1]:

Name                      Street Address    State                    Zip Code
Telephone Number          Email Address     Social Security Number   Medical Record Number
Health Plan Number        Account Number    Account Balance          ACH Number
Bank Account              Routing Number    Credit Card Number       CCV Code
Driver’s License Number   Passport Number   Taxpayer ID              Date of Birth


Just these example values represent a staggering amount of data across potentially every piece of content your organization creates, updates, manages, stores, distributes, and archives. The compliance costs required to scour content for this data can be monumental in terms of both dollars and hours. However, these costs can pale in comparison to the costs associated with a data breach. A recent study found that the average total cost of a data breach in the US can exceed $7 million, with an average per-record cost of more than $200 [2]. These are some frightening numbers. So how do we help strengthen your compliance efforts while also reducing your compliance costs? That is the question we asked ourselves several months ago and the answer, we believe, is the TEAM Redaction Engine.

We built the Engine to meet three specific needs:

  • textual pattern matching in digital documents
  • integration with scanning solutions for paper documents
  • redaction of identified data in both PDFs and images

The Redaction Engine is a plugin, or component, for Oracle’s WebCenter Content (WCC) platform. This was done because WCC is a leader in the Enterprise Content Management space and it has direct integrations with powerful scanning solutions, Oracle’s cloud-based platforms, and powerful search options such as Elasticsearch. Other than enabling scanning, the component requires no additional software or hardware to perform its functions against the content in your repository – which is a revolution in the sensitive information arena.

Pattern Matching

When it comes to assisting with sensitive information compliance, the primary challenge comes in the form of identifying the data in question. Between our efforts with WebCenter Content and with Elasticsearch in the enterprise content management space, we realized that we already have access to every character of every piece of digital content that’s been indexed. What it boils down to is identifying patterns and developing a method for seeking those patterns in the available data. Look again at the table of examples above. Of the 20 data points described, 18 of them (90%!) can easily be identified based on a likely pattern.  This is where we started on our efforts.

The Redaction Engine is focused around a primary core – the Pattern Matching Engine. We allow you to craft a series of patterns using both Regular Expressions and Simple Patterns. To identify Social Security Numbers, for example, you’ll need to take into account the common variation which lacks the dashes. You could choose to use two simple patterns if you weren’t interested in specifics of SSN rules:

  • (with dashes) ###-##-####
  • (without dashes) #########

These would pick up Social Security Numbers but would also incorrectly identify any numeric value which fits this form but doesn’t actually meet certain rules for SSNs, such as the rule that no group of digits can be all zeroes. We could instead craft a regular expression which is much more robust and is designed to meet the rules of SSNs laid out by the Social Security Administration [3]:

  • ^(?!219-09-9999|078-05-1120)(?!666|000|9\d{2})\d{3}-(?!00)\d{2}-(?!0{4})\d{4}$
  • ^(?!219099999|078051120)(?!666|000|9\d{2})\d{3}(?!00)\d{2}(?!0{4})\d{4}$
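As a quick check, the two expressions above can be exercised with Python's standard `re` module. This is only a sketch for trying the patterns outside the product; the function name `looks_like_ssn` and the sample values are assumptions made for illustration, not part of the Redaction Engine.

```python
import re

# The two expressions quoted above, copied unchanged.
SSN_WITH_DASHES = re.compile(
    r"^(?!219-09-9999|078-05-1120)(?!666|000|9\d{2})\d{3}-(?!00)\d{2}-(?!0{4})\d{4}$"
)
SSN_WITHOUT_DASHES = re.compile(
    r"^(?!219099999|078051120)(?!666|000|9\d{2})\d{3}(?!00)\d{2}(?!0{4})\d{4}$"
)


def looks_like_ssn(value: str) -> bool:
    """Return True when the value passes either expression (hypothetical helper)."""
    return bool(SSN_WITH_DASHES.match(value) or SSN_WITHOUT_DASHES.match(value))


if __name__ == "__main__":
    # Made-up sample values: two well-formed, three that break an SSN rule.
    for sample in ["123-45-6789", "123456789", "000-12-3456",
                   "123-00-6789", "078-05-1120"]:
        print(sample, looks_like_ssn(sample))
```

Note that both expressions are anchored with `^` and `$`, so they validate a whole value; to search inside a larger block of text the anchors would need to be swapped for something like word boundaries.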

This is an example of how simple and also how robust the pattern matching can be. These same tactics can be applied to matching pretty much any other predictably-formatted value. The only question is the depth of complexity you want to apply to the efforts. Given that Regular Expression experts are fairly rare, we also included an expression evaluator in the interface. This provides feedback on your expressions and confirms whether each pattern makes sense to the engine or not.

Now that the patterns are configured, WebCenter Content does the heavy lifting during the check-in process of opening the document and extracting the text within so that the document’s contents can be indexed. This indexing means you can search for a word inside the document instead of just the title or metadata. It also means we have a readily available block of extracted text that we can quickly parse against our patterns and identify desired information. Once identified, we simply hand the PDF to an editing library which adds the redaction, burns it into the document, and saves a new copy as a “Redacted Rendition”. The new PDF even remains full-text searchable – it just has the redacted text removed! This is the simplest – and most common – scenario.
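The parsing step described above can be pictured with a small sketch. It assumes the extracted text is available as a plain string; the pattern names, the expressions, and the `mask` helper are hypothetical, and masking characters in a string only stands in for the PDF editing library the product actually uses.

```python
import re
from typing import List, NamedTuple


class Hit(NamedTuple):
    pattern_name: str  # which configured pattern matched
    start: int         # offset of the first matched character
    end: int           # offset one past the last matched character
    text: str          # the matched value


# Hypothetical pattern set; a real configuration would be richer.
PATTERNS = {
    "ssn_dashed": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "us_phone": re.compile(r"\(\d{3}\) \d{3}-\d{4}"),
}


def find_sensitive(text: str) -> List[Hit]:
    """Run every pattern over the extracted text and collect the matches."""
    hits = []
    for name, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append(Hit(name, m.start(), m.end(), m.group()))
    return sorted(hits, key=lambda h: h.start)


def mask(text: str, hits: List[Hit]) -> str:
    """Replace each matched character with X (stand-in for real redaction)."""
    chars = list(text)
    for h in hits:
        for i in range(h.start, h.end):
            chars[i] = "X"
    return "".join(chars)


if __name__ == "__main__":
    sample = "SSN 123-45-6789, call (651) 555-0100."
    found = find_sensitive(sample)
    print(found)
    print(mask(sample, found))
```

Keeping the pattern name and character offsets with each match is what makes it possible to hand precise locations to a redaction step.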

Scanning Integration

Less common but no less important are image-based, or scanned, documents.  As paper documents are still a fact of life, we always want to keep an eye on our methods for digitizing that physical content to bring it into the repository. Whether that’s a simple WebCenter Capture setup or some other scanning platform, the important piece is that we get this now-digital item into a managed structure such as WebCenter Content. If your choice is to stick with the WebCenter family, the Redaction Engine is specifically enhanced to work intimately with both Capture and Oracle Forms Recognition (OFR). One of the best examples of this partnership is with content that contains non-digital text, that is, handwriting.

After the paper item is digitized via the scanner and Capture, it’s passed to OFR for processing. This is where we set up “markers” and instruct OFR where to look for characters in a specific location. Even if OFR cannot interpret the handwriting (via Optical Character Recognition or OCR) it can identify the precise coordinates for the location of the handwriting. Now we simply pass the digitized document and the coordinates to WebCenter Content and the Redaction Engine.


In the end we have a perfectly redacted entry even though the text wasn’t readable by a character recognition engine. This means that as long as we can find digital “landmarks” in our document, we can train Oracle Forms Recognition to look for and identify illegible entries and pass those for redaction.

If, however, your solution for scanning physical documents does not include WebCenter Capture or Oracle Forms Recognition, the Redaction Engine is happy to work with those items as well.


A Bad Fax

In fact, any image-based content can be passed through the Redaction Engine as we’ve included an OCR library with the product. This means not only image-based PDFs but native TIFF, JPEG, or GIF files can be processed as well. The Redaction Engine OCR library will process the content item and scan for any machine-readable English text that it can find. Of course, as with any OCR process, there are limitations in terms of language, fonts, and file resolution; however, the vast majority of modern scanned documents will have no problems being read. If you’re submitting documents sent via fax machine in 1997 and then digitized with a consumer-grade scanner a year later, you could very well run into issues.

Something extra on this front comes from the fact that we’re finding text in these images – search. While WebCenter Content would not ordinarily be able to include these content items in the full text search index, we’ve joined the Redaction Engine with TEAM’s Elasticsearch Integration to make this happen. That means any text found when an image or image-based item is passed through the engine is submitted to the Elasticsearch index, making it fully searchable. This means, for example, that a scanned invoice could possibly be found by searching for the vendor name, or the invoice ID, or the invoice total and not just by the metadata that was associated to the item at check-in.

Responsible Redaction

We’ve now covered three specific cases where content can be redacted:

  1. via full-text matching of the document contents
  2. via sets of coordinates passed to the Engine
  3. via pattern and location matching of OCR text in an image or image-based item

In all cases the Redaction Engine creates a new, specifically redacted content item that is separate and unique from the original file. The redactions are also “burned in” to the new file, ensuring that the underlying text is permanently removed. Both of these steps are taken first to ensure that no data is lost in the redaction process and, second, to ensure that redacted items are secure in terms of information removal.

The last piece of what we have come to call “responsible redaction” is the auditing capability of the Redaction Engine. The product keeps a record of every redaction performed – not just at a document level but at the redaction level. A single content item with several redactions has every individual redaction logged, including the specific pattern that was matched in each case. Redaction Reports can be generated for any date range desired and can be exported as a Microsoft Excel document. This exported document can now be stored as a managed record in WebCenter Content or maintained elsewhere for legal purposes. The goal in all cases is simply to provide as much transparency as possible into a process that is built to, well, do the exact opposite!
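To make the per-redaction audit trail concrete, here is a minimal sketch of what one log entry and a date-range report might look like. The record fields and function name are assumptions for illustration, and the export uses CSV from the Python standard library as a stand-in for the Excel export the product provides.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date
from typing import Iterable


@dataclass
class RedactionEvent:
    """One logged redaction (hypothetical field set)."""
    content_id: str     # the content item that was redacted
    pattern_name: str   # the specific pattern that matched
    page: int           # where in the item the redaction was applied
    redacted_on: date   # when the redaction was performed


def report_csv(events: Iterable[RedactionEvent], start: date, end: date) -> str:
    """Return a CSV report of the events that fall inside the date range."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(RedactionEvent)]
    )
    writer.writeheader()
    for event in events:
        if start <= event.redacted_on <= end:
            writer.writerow(asdict(event))
    return buffer.getvalue()


if __name__ == "__main__":
    log = [
        RedactionEvent("DOC-001", "ssn_dashed", 1, date(2016, 7, 1)),
        RedactionEvent("DOC-001", "us_phone", 2, date(2016, 7, 1)),
        RedactionEvent("DOC-002", "ssn_dashed", 1, date(2016, 8, 15)),
    ]
    print(report_csv(log, date(2016, 7, 1), date(2016, 7, 31)))
```

The point of the sketch is the granularity: one row per redaction rather than one row per document, so a single item with several redactions produces several entries.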


The Redaction Engine is not only about helping to lessen the burden on businesses that have to manually parse, identify, and redact sensitive information but to also bolster those on-going information compliance efforts and keep trouble from finding the front door. As we’ve worked on this effort, I’ve come to find a much greater appreciation for the efforts that must be undertaken to try and keep our information safe and secure. As a group, we’re incredibly pleased to be able to offer a solution that could very well save you and your business time, money, and headaches.


[1] “Guide to Protecting the Confidentiality of Personally Identifiable Information (PII)”, NIST, April 2010

[2] “2016 Cost of Data Breach Study: United States”, Ponemon Institute, June 2016

[3] “Validating Social Security Numbers through Regular Expressions”, Rion Williams, Sep 2013

TEAM Informatics’ Intelligent Content: A smart solution for businesses to manage and control content

May 19, 2016

By: Jon Chartrand – Director of Product Management

Perhaps the primary conceit when it comes to content management is this: context is king. When your content or records have context, it means they can be both cataloged and discovered with much greater ease. When we talk about context, that means metadata – or data describing data. When a document is placed into your content management system it’s important to know who it came from, who it belongs to, what the data within is regarding, and every other aspect of context that can be known, implied, or assumed. This allows the system to catalog the item appropriately and other users to search for and locate the item easily. The problem is that while context is king, entering metadata can be a royal pain – and bad metadata can ruin an otherwise good system. As we all know: garbage in – garbage out.

TEAM’s been working in the content management space for over a decade so we’ve seen this issue arise repeatedly for our clients. Relying on end users for full, complete, and accurate metadata puts stress on them, slows down the contribution process, and can lead to human error or, even worse, human disinterest. So we set out to not only solve this problem but revolutionize how context is achieved for your content. We partnered with SmartLogic and combined the power of Oracle WebCenter Content with their extraordinary context classification software, Semaphore, to create a unified, smart solution.

This is Intelligent Content.

What is Intelligent Content and how does it work?

TEAM’s Intelligent Content solution alleviates the challenges and roadblocks of requiring users to navigate the metadata process by doing the work for them. This is started by the user simply saving their content to the WebCenter Content repository. The content can be contributed automatically by line-of-business systems or even ingested from network drives or cloud-based file systems. The Intelligent Content engine processes the stored material and leverages an information classification model, or “ontology”, rather than the traditional two-dimensional taxonomy. Intelligent Content drives the auto-classification process by opening each document at contribution time and parsing the content of the document. It is then able to automatically populate metadata based on the rules of the classification model. By automatically tagging your materials, it makes your content easily findable across what would have previously been multiple taxonomic pathways.



Perhaps an example can help here. Imagine an overview document that describes a land use project to build a park. The document may contain sections on project planning, soil samples, a work breakdown, price estimates, and more. In the old-school method the Project Manager checks the item into the repository and, on reflection, classifies the item as a Project Document type item with a subtype of Overview. This is helpful, but really doesn’t encompass the breadth and depth of what’s in the document. In the new-school method, Intelligent Content parses the text and applies predefined classifications: overview, soil samples, work breakdown, pricing, and so on. This means the item can be found by others who search based on what they’re looking for, not solely on the structure of the item. The old-school method provides a single taxonomic pathway (Project Document > Overview). The new-school method enables a much more nuanced approach. When the Engineer looks for documents relating to soil samples, the item is returned. When the Construction Foreman looks for documents relating to Work Breakdown, the item is returned.
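The difference between a single taxonomic pathway and multiple classifications can be sketched in a few lines. This is a toy keyword matcher, not how Semaphore or Intelligent Content works internally; the model contents and the function name are assumptions chosen to mirror the park example.

```python
from typing import Dict, List

# Hypothetical classification model: each label lists phrases that trigger it.
MODEL: Dict[str, List[str]] = {
    "Overview": ["overview", "summary"],
    "Soil Samples": ["soil sample"],
    "Work Breakdown": ["work breakdown"],
    "Pricing": ["price estimate", "pricing", "cost estimate"],
}


def classify(text: str) -> List[str]:
    """Return every label whose trigger phrases appear in the text."""
    lowered = text.lower()
    return [
        label
        for label, phrases in MODEL.items()
        if any(phrase in lowered for phrase in phrases)
    ]


if __name__ == "__main__":
    park_doc = (
        "Project overview for the new park. Soil samples were taken in May. "
        "The work breakdown and price estimates follow."
    )
    tags = classify(park_doc)
    print(tags)
    # The Engineer and the Foreman each find the same item by their own term.
    print("Soil Samples" in tags, "Work Breakdown" in tags)
```

Because one document carries several tags, a search on any of them returns it, which is the behavior the example above describes.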

As I mentioned earlier, the ontology (AKA information classification model) consists of a set of terms and rules, which can be maintained as needed by the information or records management SME within your organization or through TEAM. Applying the information model on the search side of the equation enables “semantically enhanced” search capabilities, including a “search as you type” feature as well as the ability to browse through the model in an interactive graphical manner. Both methods create easier, faster, and more intelligent pathways for users to find the content they’re looking for in the system.


Why is this important for businesses?

Help Your Contributors.

There’s a lot of room for human error when a document is manually classified. TEAM’s Intelligent Content solution saves the content contributor time and effort by automatically tagging newly stored content. This ensures that every time new content is stored in any department of your business, its classification will be consistent and no longer susceptible to the vagaries of human interpretation.

Help Your Users.

Will the end-user always know what key words to search for when looking for a specific document? The auto classification system makes finding your documents faster and easier than ever. What could potentially take hours to locate within a large system can now be found in a matter of seconds due to the unique ontology model utilized by Intelligent Content.

Help Your Business.

By changing the way your content is cataloged and managed, TEAM’s Intelligent Content solution is a bottom line contributor to the overall enhancement of your business.

While this sounds like a sales pitch – and I admit it kind of is – I want you to understand that we’re also incredibly excited about the results we’re already seeing from Intelligent Content: better classification, less human error, a simpler contribution experience, and far faster and more accurate searching. This is the next step in the evolution of enterprise content management. If you’re interested in learning more, you can check out our YouTube video on this topic or email us directly with your questions.

Using Enterprise Manager for Troubleshooting and Optimizing your WebCenter Content Deployment

May 10, 2016

Raoul Miller – Enterprise Architect

When Oracle WebCenter Content made the architectural shift from a standalone J2SE application to a managed application running in WebLogic Server (WLS), the change provided a number of new capabilities for management, integration, and support.  One of these capabilities is the version of Enterprise Manager that is built into WLS which allows administrators to monitor many different aspects of the WebCenter Content application.

If you haven’t been through formal WLS or Enterprise Manager training, the interface may seem complex or confusing.  My speaking session at Collaborate 2016 in April explained how to use Enterprise Manager to monitor, optimize, and troubleshoot your WCC deployment(s) and I wanted to accompany that with a post here to provide a bit more context.

First a little background – there are multiple versions of Enterprise Manager (EM), and it’s important to be clear which one we are talking about.  Those of us who have worked with the Oracle Database will be familiar with the original EM that’s been used to manage databases since version 9i.  This is now specifically called Enterprise Manager Database Control.

At the other end of the spectrum there is the full-featured Enterprise Manager platform.  This is a multi-tier set of applications which monitor and manage all aspects of your Oracle hardware and software deployment.  We recommend it highly for large Enterprise clients, but it can be expensive and complex.

In the middle is the Enterprise Manager we will discuss today which is a set of web-based tools used to manage and monitor your WLS application server deployments.  You access this at almost the same URL as the WLS administration interface – http://<WLS servername>:7001/em – note the /em rather than /console for WLS, and it’s possible you may not be using the standard 7001 port.

Your initial screen will show you what is deployed to your domain and whether the applications / servers are running or not.


You’ll notice that there are lists of application deployments and managed servers within the domain and right clicking on any of these will show you custom actions for each.


Before we get to what to monitor and measure, let’s take a moment to review best practices when we are optimizing or troubleshooting WebCenter Content.  As the Java application architecture has stayed much the same over the years, the standard areas to focus on have remained fairly constant.  It cannot be stated strongly enough that it is vital to look at ALL these areas, measure and test performance before making any changes, change one thing at a time, and then re-test and re-measure after making that isolated change.  It’s very much an iterative approach, as without data you are just playing around with inputs and outputs to a black box model.

The areas you need to monitor and measure when optimizing or troubleshooting WCC are:

  • Java virtual machine
  • File system
  • Database (metadata and indexing)
  • Network
  • Authentication / authorization
  • Customization / components
  • Hardware


(I have to credit Brian “Bex” Huff and Kyle Hatlestad for their presentations back in the day at Stellent which taught me this approach.)

Enterprise Manager can help you with many of these areas, but not all – you need other tools to look at file system I/O and utilization, network speed and routing, and (non-Oracle) hardware.  However, for the other areas, EM can be extremely helpful.  Let’s look at a couple of examples:

JVM metrics

Right click on the managed server instance and select JVM Performance


This brings up a pre-selected set of JVM metrics and a non-standard color scheme.



This will let you monitor the heap and non-heap memory usage in real time.

**TIP** You may see that the heap is smaller than you thought you had set it – I have often seen an issue where there has been confusion over where the non-default maximum and minimum heap sizes should be set.

Lower on the page you’ll see more granular data on JVM threads, objects, etc.



Datasource Metrics

You’ll need to open the metric palette on the right side of the screen and open up the Datasource Metrics folder.


**TIP** Make sure you choose this rather than the Server Datasource Metrics, because you will need to select the “CSDS domain-level Datasource”.


WebCenter Content Metrics


Navigate to the WebCenter Content Server deployment at the bottom of the folder list in the left hand area:


Select “Performance Summary” and you’ll see a pre-selected set of content-specific metrics in the graph area.  As with all of the other selections, you can add or subtract metrics as you go – this short cut just gives you a good starting point.


We have only scratched the surface here of the capabilities of Enterprise Manager and its use for optimization of WebCenter Content.  For much more information, download my presentation from Collaborate 2016 or contact us through our website.  We’ll be happy to discuss how we can further help you optimize and troubleshoot your WCC deployments.

Taming the Paper Tiger with Oracle Forms Recognition

April 22, 2016

By: Dwayne Parkinson – Solution Architect

We all like to believe that technology makes everything somehow better, right? Our parents’ watches tell time and maybe the date while mine gives me the weather, tells me when to get up and exercise, tracks calories, integrates with email and sends text messages. Seemingly everything from our refrigerator to our garage door opener to the latest and greatest ERP system is connected to our phones and devices these days. Yet amidst all this technology and integration, lurking somewhere in the bowels of virtually every company and organization is a massive pile of paper.

They say the first step to fixing a problem is to admit that we have one.  So let’s admit what the problem is: paper.  It used to be that paper went back and forth between companies as a record of the various transactions.  If you placed an order, it was on paper.  When you got a shipment, there was more paper.  When you needed to pay, there was a paper invoice.  And up until recently, when you paid, someone somewhere was potentially issued a paper check. With the advent of the Electronic Data Interchange (EDI), electronic transactions thankfully became the standard – or so we’d like to think.  What’s really happened however is that only those transactions between electronically astute organizations have migrated to EDI, while smaller organizations and those facing significant technology challenges have unfortunately remained largely paper-based.

While many of these smaller organizations have stopped sending physical paper for these transactions, it’s important to recognize that an e-mail with a PDF attachment is still a paper-based transaction in the end.  Ultimately it requires a person somewhere to open the attachment, read it, extract the important information, and then enter that information into the business system.  Due to this process, the end result is that there are very few organizations that are completely free from the shackles of paper.

The obvious solution is to use some kind of scanning and optical character recognition (OCR) to try to automatically import data into the systems.  The problem with this solution is that many existing OCR systems use technology that hasn’t changed in twenty years.  Often enough the legacy processes – defining templates, creating scanning zones, forcing customers to use predefined forms and cryptic barcode solutions – all fail for various reasons.

Oracle Forms Recognition (OFR) approaches the problem of scanning in a very different way.  First of all, the software is designed to simulate what a human might do when looking at a piece of paper.  The first thing a person does is to evaluate the document and figure out what the document is.  Is it a W2?  Is it an invoice? Is it a resume?  OFR does the same thing.  Based on the layout of the document, the actual content, and several other metrics OFR classifies a document automatically.

Once classified, rules are set up to define what various pieces of information look like within that document.  For example, a Social Security Number is always in the same general format: three digits, a dash, two more digits, another dash, and four digits (999-99-9999).  When a person looks for a Social Security Number on a piece of paper they look for a couple of things:

  • They look for a specific format
  • They look “geographically” in the general area where they expect the social security number to be based on the document type and past experience

OFR does that exact same thing.  Here we are defining a simple rule for a social security number:


Based on that rule OFR will identify candidates on the scanned documents as shown here:


With OFR, rules can be defined to specify formats or to look next to, above or below certain identifiers (e.g., “SSN” and “Social Security Number”).
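The combination of a format rule and a nearby identifier can be illustrated with a short sketch. This is not OFR's rule syntax or API; it is a plain-text approximation in which "next to" is reduced to "within a fixed number of characters before the value", and the function name and window size are assumptions.

```python
import re
from typing import List, Tuple

# Format rule: three digits, dash, two digits, dash, four digits.
SSN_FORMAT = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# Identifiers that suggest the value next to them is an SSN.
LABELS = re.compile(r"SSN|Social Security Number", re.IGNORECASE)


def find_candidates(text: str, window: int = 30) -> List[Tuple[str, bool]]:
    """Return (value, has_nearby_label) for every value matching the format.

    A label counts as nearby when it appears within `window` characters
    before the value (an assumed, simplified notion of location).
    """
    results = []
    for m in SSN_FORMAT.finditer(text):
        context = text[max(0, m.start() - window):m.start()]
        results.append((m.group(), bool(LABELS.search(context))))
    return results


if __name__ == "__main__":
    page = ("Employee SSN: 123-45-6789. "
            "The purchase order reference is 987-65-4321.")
    for value, labeled in find_candidates(page):
        print(value, "near label" if labeled else "format only")
```

Both values fit the format, but only the first sits next to an identifier, so it would be the stronger candidate.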

Once the rules are in place, OFR identifies candidate values on the document and OFR is then trained on sample documents so it can learn where to expect to find each value.  This process is known as creating a “learn set”.  Batches of sample documents are scanned and “taught” to OFR so that when it encounters similar documents in the future it will already know how to handle them.

Here we see the evolution from the traditional scanning/OCR model. With the OFR approach it isn’t necessary to define separate templates for each type of document that might come into the company.  Instead a single document class is created to represent a group of information that is needed from a class of documents.  For example, there may be one class for information contained on a W2 tax form and another class for health insurance information retrieved from various health provider forms.  With just two classes defined, OFR can handle all of the variations of W2 forms and all of the healthcare provider forms a company might reasonably encounter.

In the event that OFR encounters a problem such as a light scan or invalid data, there is an intuitive browser-based verification system that allows users to review the exception data and make an informed decision.  OFR can also be configured so that each piece of data it finds is measured against a certainty level.  So whenever OFR is unsure if the data it has is correct (that is, the certainty level is low), the item can be sent into the verification system where a person can review it.  Additionally, as documents go into the verification system they can be flagged to help further train the system so the accuracy of the system continues to improve over time.
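The certainty check described above amounts to a threshold decision per extracted value. A minimal sketch, assuming certainty is expressed as a number between 0 and 1 and that the threshold is configurable; the class, field names, and default value are hypothetical and not taken from OFR.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ExtractedField:
    name: str         # e.g. "ssn" or "invoice_total"
    value: str        # the text that was read from the document
    certainty: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)


def route(fields: List[ExtractedField],
          threshold: float = 0.8) -> Tuple[List[ExtractedField], List[ExtractedField]]:
    """Split fields into (accepted, needs_review) by certainty level."""
    accepted = [f for f in fields if f.certainty >= threshold]
    needs_review = [f for f in fields if f.certainty < threshold]
    return accepted, needs_review


if __name__ == "__main__":
    scanned = [
        ExtractedField("ssn", "123-45-6789", 0.97),
        ExtractedField("invoice_total", "1,2O4.50", 0.41),  # light scan, O read for 0
    ]
    ok, review = route(scanned)
    print([f.name for f in ok], [f.name for f in review])
```

Anything in the review list would go to a person in the verification step, and the corrected values could then feed back into training.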

Behind all of this technology is a powerful scripting engine that provides the ability to customize the process as needed and integrate with other systems and a host of other standard OCR technologies.  These include optical mark recognition (OMR), barcode recognition, zonal OCR, floating anchors and pre-processing technologies such as box and comb removal.

We’ve seen wild success with our clients through the adoption of modern, powerful and flexible scanning solutions like Oracle Forms Recognition. From relatively simple needs of only several hundred documents a week to much larger operations, OFR and WebCenter Capture can help you evolve your processes and ultimately cage the Paper Tiger.

TEAM Informatics Introduces Their Innovative Product, DOCSConnect for Oracle WebCenter Content and Oracle Documents Cloud Service

October 26, 2015

MINNEAPOLIS, Oct. 26, 2015 – Oracle OpenWorld 2015 – TEAM Informatics (“TEAM”), a leading enterprise content management products and service provider and Oracle Gold Partner, has recently released their newest connector, DOCSConnect. The announcement comes from TEAM at Oracle OpenWorld 2015 where they are participating in the event as presenters of two unique sessions in the WebCenter space.

TEAM’s DOCSConnect joins the power of Oracle WebCenter Content 11g (WCC) and the highly developed Oracle Public Cloud offering, Documents Cloud Service. This hybrid enterprise content model provides security, compliance, and data management features with the extensive collaborative capabilities of the cloud. DOCSConnect is the first connector that functions solely with WebCenter Content 11g and Oracle DOCS rather than utilizing a third party installation or interface. TEAM developed DOCSConnect in order to provide a deeply integrated, controlled, and auditable hybrid document system to ensure content could be accessible and editable at all times from any device.

DOCSConnect is an enhancement component within Oracle WebCenter Content 11g. Not only does DOCSConnect provide improved access to enterprise content, it enables an unprecedented level of collaboration and maintains auditable version histories of files uploaded in both WCC and DOCS. DOCSConnect allows WebCenter Content 11g to serve as a Single Point of Truth (SPoT) for all enterprise content while leveraging the burgeoning power of Documents Cloud Service and Oracle’s Public Cloud platform. “Oracle’s cloud products are game-changers for the traditional enterprise software model and our DOCSConnect product is a powerful way to bridge the gap between paradigms. The best of WebCenter merged with the next generation of enterprise capabilities, enables true collaboration for our customers,” said Doug Thompson, CEO of TEAM.

For more information on DOCSConnect, watch their YouTube Video on the product, and visit

About Oracle Open World 2015

Oracle OpenWorld is an annual Oracle event for business decision-makers, IT management, and line-of-business end users. It is held in October in San Francisco, California. The world’s largest conference for Oracle customers and technologists, Oracle OpenWorld San Francisco attracts tens of thousands of Oracle technology users every year.

About TEAM Informatics, Inc.

TEAM Informatics, Inc. is an employee-owned, Minnesota-based software products and systems integration firm with a global customer base and offices on three continents. TEAM was formed over 10 years ago and has experienced a sustained aggressive growth rate.

TEAM is an Oracle Software Reseller and a global member of the Oracle Partner Network, specializing in areas such as WebCenter Content, WebCenter Portal and Oracle Documents Cloud Service. Offerings include professional services, managed services, enterprise and development support, and an expanding set of custom products. In addition, TEAM is a Google Enterprise Partner and Reseller for the Google Search technologies. TEAM’s suite of business applications includes a GSA Connector for WebCenter for enterprise search, TEAM Sites Connector for enabling web experience management, DOCSConnect for hybrid enterprise content management, and Intelligent Content for metadata auto-classification.  Get more information on these and all of TEAM’s offerings at


Oracle is a registered trademark of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

To view this video on YouTube, please visit:

Media Contact: Doug Thompson, TEAM Informatics, Inc., 1.651.760.4802,

TEAM Speaking Sessions at Oracle OpenWorld 2015 and New Product Announcements

October 15, 2015

TEAM has had a very exciting year, and we want to share our innovation with the Oracle Community. TEAM has two unique and important speaking sessions at Oracle OpenWorld 2015 and we want to see you there! Come join us as we announce new and exciting products that enable WebCenter Content and Oracle Documents Cloud implementations.

DIE and ROT or Use Records Management to Control Content Chaos [CON4408]
Dwayne Parkinson, Solution Architect,
TEAM Informatics, Inc.
Thursday, Oct 29, 9:30 a.m. | Moscone West-2024

ContentWorX: Public Sector ECM-as-a-Service Case Study and Oracle Cloud [CON4639]
Volker Schaberg, General Manager Australia and New Zealand, Team Informatics
Thursday, Oct 29, 12:00 p.m. | Moscone West-3000

Speak with us about our newest product, DOCSConnect, connecting Oracle WebCenter Content to Oracle Documents Cloud for true hybrid content management and collaboration. Read more about it here.

Set up a time to speak with us at OpenWorld


What Oracle’s Documents Cloud Service Means for You

April 30, 2015

By: Jon Chartrand – Solution Architect

The sphere of influence that is Enterprise Content Management has been steadily expanding to encompass areas such as records, digital assets, web content, and others. This has meant your ECM solution suite has had to grow and mature to support and maintain these activities. The newest is cloud-based document management, sharing, and collaboration. Now, I bet you’re thinking, “We don’t need that nor do we support that in our enterprise.” Here’s the trick though: Your users are already doing it and they’re very likely making it happen with software that’s not a part of your enterprise ecosystem. That means it’s probably unsupported, potentially insecure, and generally out of your control – not a good combination.

The rapid growth of this field has led to many solutions which attempt to enhance the consumer-level products and businessify them by offering a few more features at a wildly increased price. While these options can seem appealing, they still represent a gap in enterprise coverage as they aren’t themselves enterprise applications. Oracle, however, has expanded their Public Cloud offering – already the largest in the world – to not only fill the gap of Enterprise File Sync & Share, but also to expand cloud content management to your on-premises solutions, as well as mesh seamlessly with other applications. Now it’s possible to keep your users happy and productive while maintaining control and even expanding the capabilities of your enterprise. Introducing Oracle’s Documents Cloud Service, also known as DOCS.

DOCS for File Sync & Share

DOCS represents a trident of capability, the first tine of which is an enterprise-grade file sync and share replacement for the consumer-grade applications your users may already be utilizing. Before you can sync or share content, however, you have to manage it, and Oracle provides a modern, intuitive web interface, accessible from every device, to do just that. From here users can upload, preview, revise, delete, and share content and folders with ease, making this the front line in our EFSS battle.

On the syncing front, native desktop applications for both Windows and MacOS allow users to seamlessly sync folders of their choosing with the local file system. This means files are available for viewing and editing when and where users demand them, without the need for an Internet connection. When connectivity is restored, the sync application automatically updates the cloud with any changes, removing a step for the user.

On the sharing front, sharing content internally and externally has been rendered both simple and secure. Internally, named users can be added to folders in one of four roles: Manager, Contributor, Downloader, or Reader. This means you have control over who has access and what kind of permissions they receive. When sharing to an external, non-DOCS user, Oracle has provided several capabilities to make the process simple and safe. First, public link accesses are carefully tracked and an audit trail is provided. Each public link can also be assigned an expiration date so you don’t have to worry about forever managing every link that’s been distributed. Even more, each public link can be created with a required passcode so that even if the link is improperly distributed, the materials remain secure. Finally, each public link can be assigned a role which is granted to those who use it. All these features combine to allow incredibly granular control over who can access what content, when, and with what privileges.
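To make the public link rules concrete, here is a minimal Python sketch of how a link with a role, an optional expiration date, an optional passcode, and an audit trail might behave. This is an illustration only and not the DOCS API: the class and field names, and the ordering of the four roles, are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import IntEnum
from typing import List, Optional, Tuple


class Role(IntEnum):
    # The four roles named above; the numeric ordering is an assumption.
    READER = 1
    DOWNLOADER = 2
    CONTRIBUTOR = 3
    MANAGER = 4


@dataclass
class PublicLink:
    """Hypothetical model of a public link, not the real DOCS data structure."""
    folder: str
    role: Role
    expires_at: Optional[datetime] = None   # None means the link never expires
    passcode: Optional[str] = None          # None means no passcode is required
    audit_log: List[Tuple[datetime, bool]] = field(default_factory=list)

    def access(self, now: datetime, passcode: Optional[str] = None) -> Optional[Role]:
        """Return the role granted to the visitor, or None if access is denied."""
        expired = self.expires_at is not None and now >= self.expires_at
        bad_passcode = self.passcode is not None and passcode != self.passcode
        granted = not (expired or bad_passcode)
        # Every attempt is recorded, whether or not it succeeds.
        self.audit_log.append((now, granted))
        return self.role if granted else None


now = datetime(2015, 4, 30, 12, 0)
link = PublicLink(
    folder="Contracts",
    role=Role.DOWNLOADER,
    expires_at=now + timedelta(days=7),
    passcode="s3cret",
)
```

With this model, a visitor who presents the right passcode before the expiration date receives the Downloader role, while a wrong passcode or an expired link yields no access, and all attempts land in the audit log.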

The last point is for those on-the-go. For mobile users Oracle provides native applications for both Android and iOS which enable feature-parity between the mobile and web platforms. This means users can access their content from virtually any device, at any time, and maintain the full suite of capabilities no matter what method they’re using. This represents an unprecedented level of access to and control over enterprise content for your users.

DOCS for Hybrid Content Management

File Sync & Share is a great step forward in content management; however, we’re still potentially left with a cache of content that stands apart from your Enterprise Content repository. DOCS addresses this through a process whereby your ECM repository is “tethered” to your DOCS repository through a third-party solution and content is shuttled between the two applications when edits are made, ensuring both repositories have the appropriate version available. This process allows your current ECM solution to remain the single point of truth in your enterprise for all content but enables users to access that content from beyond the firewall in a safe and secure manner.

The use cases for this method are almost endless, but imagine a contract package being worked on by a CMO, a salesperson in the field, and a client with contributor access via a shared link. The CMO, working from within the company, can make edits to the documents and upload them to the ECM system. The salesperson in the field accesses the documents via DOCS and can also make changes and suggestions. As revisions are made, the CMO is kept in the loop as the document updates back to the ECM system as well. Finally, when complete, the client can access the documents, digitally sign them, and upload new versions to DOCS. Within moments of the upload, the CMO has access and can move them to the appropriate next step.

Hybrid Content Management takes the premise of EFSS and keeps it a truly enterprise endeavor by ensuring that content is reflective of only one repository. This ensures that all users are working with the same materials without fear of unknown changes or missing versions. It also guarantees that content owned by the enterprise is continually merged into the enterprise so there’s reduced anxiety over content ownership and location.
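The "tethering" idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how DOCSConnect or any particular connector is implemented: each repository is modeled as a dictionary, each document carries a hypothetical revision counter, and whichever side holds the older revision receives the newer one.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Revision:
    number: int    # hypothetical, monotonically increasing revision counter
    content: str


# A repository is modeled as a mapping of document name to latest revision.
Repository = Dict[str, Revision]


def reconcile(ecm: Repository, docs: Repository) -> None:
    """Copy the newer revision of each document to whichever side is behind."""
    for name in set(ecm) | set(docs):
        ecm_rev = ecm.get(name)
        docs_rev = docs.get(name)
        if docs_rev is None or (ecm_rev is not None and ecm_rev.number > docs_rev.number):
            docs[name] = ecm_rev    # ECM is newer: flow outward to the cloud
        elif ecm_rev is None or docs_rev.number > ecm_rev.number:
            ecm[name] = docs_rev    # cloud is newer: merge back into ECM


ecm = {"po.docx": Revision(3, "director edit"), "contract.pdf": Revision(1, "draft")}
docs = {"po.docx": Revision(2, "agent upload"), "signed.pdf": Revision(1, "client signed")}
reconcile(ecm, docs)
```

After the call both dictionaries hold the same three documents at their newest revisions. A real integration would also need to handle conflicting edits made to the same revision on both sides, which this sketch deliberately ignores.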

DOCS for PaaS Application Integration

Finally, DOCS takes an even longer and wider view of its role in the enterprise by enabling you to integrate other Software as a Service (SaaS) applications. The idea here is that any application to which users are uploading content represents another repository in the enterprise. Why should contracts uploaded to SalesForce live in that application? It’s not a content management application and it doesn’t have the metadata, workflows, and processes that your ECM system has. Documents Cloud Service works to solve this issue by providing a rich API foundation and an accessible embedded interface to allow you to merge applications with it and utilize its capabilities as a content platform. This Platform as a Service (PaaS) functionality allows you to keep your enterprise’s content in a single location – especially if you’re utilizing the Hybrid CM capabilities and merging your DOCS repository with your ECM platform.

With the embedded interface method you can add a simple iframe to any updateable UI to create an almost seamless merging of the two applications. While it looks like a user is uploading documents to the primary application, in reality they’re uploading to DOCS. With the API method, much more elaborate services can be written to customize the functionality of virtually any application, creating a background integration with Documents Cloud Service that is completely transparent to users. In either case, you’re removing another disparate cache of content and centralizing management into a single location. Ultimately this means less storage overhead for your SaaS applications and more complete control over your enterprise content.
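As a rough illustration of the embedded interface method, the Python helper below builds the iframe markup a host application might place on a page. The base URL, the `folder` query parameter, and the function name are all placeholders invented for this sketch; the actual embed URL format should be taken from Oracle’s documentation.

```python
from html import escape
from urllib.parse import urlencode


def embed_iframe(base_url: str, folder_id: str, width: int = 800, height: int = 600) -> str:
    """Build iframe markup pointing at a (hypothetical) embeddable folder view."""
    # URL-encode the query string, then HTML-escape the whole attribute value.
    query = urlencode({"folder": folder_id})
    src = escape(f"{base_url}?{query}", quote=True)
    return f'<iframe src="{src}" width="{width}" height="{height}"></iframe>'


# Placeholder URL and folder ID, used only for demonstration.
snippet = embed_iframe("https://docs.example.com/embed", "F123")
```

The host application would render `snippet` wherever its own upload control used to live, so the user stays in the primary application while the content goes to the cloud repository.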

Bringing It All Together

Consider a purchase order document uploaded to a contact entity in SalesForce. Through an integration with Documents Cloud Service, the content item is actually seamlessly uploaded to DOCS. With the DOCS repository linked to your on-premises platform, the content is replicated to the appropriate folder in the ECM system and an automatic workflow is started, alerting the Director of Sales to the new purchase order and requesting approval. The Director makes a small edit and approves the content. This sends a notification to the sales agent and ends the workflow. The content, now being newer in the ECM system than on DOCS, then flows outward to the cloud, updating the version there. The sales agent happens to also use the desktop client to sync DOCS content with their laptop and so the version there is updated automatically. On receiving the notification, the agent goes to their Oracle Documents folder on the desktop and opens the purchase order to review the Director’s changes. Satisfied, the agent closes the document and then right-clicks on it to access DOCS’ sharing. The agent creates a public link with downloader privileges and sends this link to the purchaser.

In this scenario, the content is available through the SalesForce site, the DOCS site, the DOCS mobile apps, synced to the desktop, and through the on-premises ECM platform. Instead of having two, three, even four different copies of the content across various systems and on various workstations, all versions are centrally managed and maintained in the system of record. This degree of centralized control is precisely what Enterprise Content Management seeks to achieve and Documents Cloud Service brings us all one step closer to that goal.


Have questions? Want to learn more? Contact TEAM today!

