WatsonR : IBM Watson + R Package - Part 1

POSTED IN: Building Bridges from R to IBM Watson

WatsonR is a package for R that helps users access the power of the IBM Watson Developer Cloud.

CAUTION: As of 9/9 THE CODE BASE IS WORKING AS SEEN IN VIDEO BELOW - VERY EARLY STAGE

SOME ISSUES INSTALLING PACKAGE IN WINDOWS - SO YOU MAY NEED TO FIDDLE A BIT WITH CODE

HOPE TO GET SOME TESTERS / HELPERS TO TUNE THE INSTALL BY Q4 - AND BUILD OUT VISUALS

Twelve months ago I started writing code in R to connect to IBM Watson Developer Cloud (WDC) Services.

https://github.com/rustyoldrake/R_Scripts_for_Watson

The code is about 30 R programs and scripts that call the Watson Developer Cloud text and speech services from R. It's pretty ham-fisted in places (my R skills are mediocre at best), but it provides about two dozen practical examples of how Watson can help R users process unstructured data and work with speech and audio. I also started this blog series, "Building Bridges from R to IBM Watson" (https://dreamtolearn.com/ryan/r_journey_to_watson), with videos here: https://www.youtube.com/c/RyanAndersonTechnical

 

IBM Joins R Consortium

In early June 2016, IBM announced it had joined the R Consortium, an open-source foundation launched by the Linux Foundation in 2015 to support the R programming language and its user community. IBM joined Microsoft and RStudio as platinum members. I had been noodling on unifying the best bits of the code into a package, so when that was announced it got me thinking....

 

Tipping Point - OK Let's Go!

Given that there is pretty decent coverage of R + Watson in my GitHub code base, and judging by the traffic on the blog, it seems the scripts are useful to some R users. So in June, the first version of WatsonR was created.

I've never created a package before, and I'm unfamiliar with package creation and CRAN, but the tools in RStudio were pretty good and it seemed worth taking a shot. I hope you'll be patient! :-)

 

 The IBM Watson Developer Cloud (WDC) offers a variety of services for developing cognitive applications. Each Watson service provides a Representational State Transfer (REST) Application Programming Interface (API) for interacting with the service.

 

IBM Bluemix is the IBM open cloud platform (PaaS) that provides mobile and web developers access to IBM software for integration, security, transaction, and other key functions. It is built on Cloud Foundry open source technology.

 

Watson services and API keys are created by registering for Bluemix at https://console.ng.bluemix.net/ and then (1) invoking the IBM Watson services, (2) generating keys, and (3) getting the keys into R.

 

* EACH WDC API / SERVICE NEEDS ITS OWN KEY *
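
One way to get per-service keys into R without pasting them into scripts is to keep them in environment variables. The sketch below is only an illustration of that idea: the helper name `get_watson_key` and the `WATSON_<SERVICE>_KEY` naming scheme are assumptions made up for this example, not something WatsonR defines, and the key value is a placeholder.

```r
# Hypothetical helper: look up one service's key from an environment
# variable named WATSON_<SERVICE>_KEY. The naming scheme is an assumption
# for this sketch, not a WatsonR convention.
get_watson_key <- function(service) {
  var_name <- paste0("WATSON_", toupper(service), "_KEY")
  key <- Sys.getenv(var_name, unset = NA)
  if (is.na(key) || !nzchar(key)) {
    stop("No key found for service '", service,
         "'; set the environment variable ", var_name, " first.")
  }
  key
}

# Each service gets its own variable; the value here is a placeholder.
Sys.setenv(WATSON_TONE_KEY = "tone-placeholder")
tone_key <- get_watson_key("tone")
```

Asking for a service whose variable is not set stops with an error that names the missing variable, which makes it easier to spot which of the several keys has not been entered yet.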

 

 INSTRUCTIONS FOR GETTING KEYS: See this blog (Part 2) for a step-by-step: https://dreamtolearn.com/ryan/r_journey_to_watson/43

 

 

Focus Areas

Where do we start? The services targeted by the initial WatsonR functions are listed below, along with the rationale for selecting them:

Service                      Include in V1?   Benefit / Comments

AlchemyLanguage              Yes              Versatile enrichment of unstructured text
Natural Language Classifier  Yes              Powerful and flexible NLC. Lots of code.
Personality Insights         Yes              Fun and informative. Good demos.
Tone Analyzer                Yes              Fun and informative. Good demos.
Speech to Text               Yes              WAV, and can talk to Watson / R is cool
Text to Speech               Yes              Talking computers are cool too
Language Translation         Later            Cool, but not primary use case for R users
Conversation                 Later            New service, APIs evolving fast
AlchemyData News             Later            Fresh news data is fun to work with
Visual Insights              Later            Start with text and audio
Visual Recognition           Later            Start with text and audio
Dialog                       No               Will do Conversation later
Document Conversion          No               Little benefit for R users
Retrieve and Rank            No               Corpus and complexity
Tradeoff Analytics           No               Little benefit for R users / stats

 

Goals

 

1 - Connect to Watson Developer Cloud APIs

Get WatsonR and its dependencies installed, and connect to the Watson APIs (capture the keys and verify that keys and connectivity are OK).

2 - Basic Functions - Processing Data

Run some basic functions. For example, use the Alchemy service to extract things like sentiment, keywords, and entities from "I love strawberry ice cream in Vancouver", which returns a multitude of info:

  "language": "english",
  "keywords": [
    {
      "relevance": "0.948339",
      "sentiment": {
        "score": "0.704352",
        "type": "positive"
      },
      "text": "strawberry ice cream"
    },
    {
      "relevance": "0.306741",
      "sentiment": {
        "score": "0.704352",
        "type": "positive"
      },
      "text": "Vancouver"
    }

etc...
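
To make a response like this easier to work with in R, the keywords can be flattened into a data frame. The following is a minimal sketch, assuming the JSON has already been parsed into a nested list (written out by hand here so the example runs without any extra packages). The function name `keywords_to_df` and the column names are made up for illustration and are not part of WatsonR.

```r
# The parsed response, written out by hand as the nested list a JSON parser
# would typically produce (only the fields shown in the excerpt above).
response <- list(
  language = "english",
  keywords = list(
    list(relevance = "0.948339",
         sentiment = list(score = "0.704352", type = "positive"),
         text = "strawberry ice cream"),
    list(relevance = "0.306741",
         sentiment = list(score = "0.704352", type = "positive"),
         text = "Vancouver")
  )
)

# Hypothetical helper: one row per keyword. The excerpt shows numbers
# returned as strings, so they are converted with as.numeric().
keywords_to_df <- function(resp) {
  rows <- lapply(resp$keywords, function(k) {
    data.frame(
      text            = k$text,
      relevance       = as.numeric(k$relevance),
      sentiment_score = as.numeric(k$sentiment$score),
      sentiment_type  = k$sentiment$type,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

keywords <- keywords_to_df(response)
print(keywords)
```

Once the keywords are in a data frame, the usual R tools (sorting by relevance, filtering by sentiment type, plotting) apply directly.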

 

3 - Educate with Demo

Another goal was to 'reduce barriers to entry' for people who use R but who may not necessarily code in Python or Node, or be full stack. The package will include data and examples / demos that leverage some of the past blogs, showing how Watson can transform unstructured data into something useful and interesting. In addition to the basic Alchemy enrichment of phrases, some Tone Analyzer and Personality Insights data sets are included in the package (making it easy to get started, much as the built-in mtcars data set does). We're also including the Harry Potter Sorting Hat ground truth for the Natural Language Classifier (NLC) service. Most services are 'ready to go' and don't need training, but the NLC is user configured, so the package offers a way to roll your own with no prior experience.


 

Functions in WatsonR

 

Here are the first focus areas and initial thinking on naming conventions for functions. 

  • watson.status - test endpoint connection and verify key
  • watson.keys - enter in keys / authenticate (several ways to handle keys - bias to helping 'newbs')
  • watson.demo - what the demos will do:
    • alchemy language - shows enrichment from a COMBINED call for a sentence
    • tone - runs 100 samples in a table through tone, enriches, and pastes into a new table
    • personality insights - sends contents of 10 (?) URLs to the service, and pastes results to a table (52 columns)
    • NLC - takes the sorting hat ground truth from the NLC/Harry Potter demo, TRAINS A NEW CLASSIFIER, lets the user monitor training time/status, and then runs tests against the new model
    • speech to text (STT) - sends up an example WAV file and provides a transcript
    • text to speech (TTS) - sends a transcript to Watson, and plays back the WAV file
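
The tone demo's "enrich a table" step can be sketched in base R as follows. This is only an illustration of the pattern (score each row, then bind the scores on as new columns). `fake_tone` is a made-up stand-in that runs offline and returns invented numbers; it is not the Tone Analyzer API, and `enrich_table` is a hypothetical name, not a WatsonR function.

```r
# Stand-in for a call to the Tone Analyzer service. It is NOT the real API:
# it returns made-up scores derived from the text so the sketch runs offline.
fake_tone <- function(text) {
  c(n_chars = nchar(text),
    exclaim = as.numeric(grepl("!", text, fixed = TRUE)))
}

# Enrich a table: score each row's text, then bind the scores on as columns.
enrich_table <- function(df, text_col, score_fun) {
  scores <- lapply(df[[text_col]], score_fun)
  cbind(df, do.call(rbind, scores))
}

samples <- data.frame(
  id   = 1:3,
  text = c("I love this!", "Not sure about that.", "Terrible service!"),
  stringsAsFactors = FALSE
)
enriched <- enrich_table(samples, "text", fake_tone)
print(enriched)
```

Passing the scoring function in as an argument means the same loop would work with a real service call swapped in for the stand-in, as long as it returns a named numeric vector per text.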

Function Names

  • watson.alchemy
  • watson.tone
  • watson.pi
  • watson.nlc
  • watson.stt
  • watson.tts

 

Summary

As of August 16th, 2016, the WatsonR package is still in alpha. It is not available on CRAN, but it is available on GitHub:

https://github.com/rustyoldrake/WatsonR

 

 

 

 

Other Packages

In August 2016, Columbus Collaboratory released CognizeR, an open-source R extension to help access IBM Watson.

https://www.ibm.com/blogs/watson/2016/08/accelerating-data-scientist-access-watson-cognizer/

http://columbuscollaboratory.com/News/entryid/25/cognitive-press-release

There is some overlap with WatsonR, and it's worth a look, especially if you are working with the Visual Recognition service (as of August 2016, my code base does not do any image / vision access; it's all speech and language).

 

 

 

Caveat: The contents of this blog are my own opinion and do not necessarily reflect my employer's views.


About the Author

Ryan Anderson

Hi! I like to play with data, analytics and hack around with robots and gadgets in my garage. Lately I've been learning about machine learning.

About this blog

This is an informal blog that explores tools, code and tricks that group members have developed to engage IBM Watson cognitive computing services from the R programming language. Packages include RCurl to access Watson APIs, for services that include Natural Language Classifier and Speech to Text. THIS IS MY PERSONAL BLOG - it does not represent the views of my employer. Code is presented as 'use at your own risk' (it has lots of bugs).

Created: September 13, 2015
