VistA, MUMPS and NoSQL

[Extra disclaimer: I am not currently involved in any deliberations with regard to AHLTA or VistA or iEHR, and have no particular knowledge of current thinking on EHR inside the Pentagon. I briefed the Hon. Beth McGrath on Open Source Software in April 2011 – a long time ago.]

As some of you probably know, Jon Stewart of the Daily Show recently ran a clip about AHLTA vs VistA (embedded below for convenience).

One of the interesting side-stories from this is the open-source vs proprietary angle which Stewart did not address.  The VA was (is?) a big supporter of Open Source Software.  (It is less clear how much commitment remains, now that both Roger Baker and Peter Levin have left the VA.)

An Open Source Software model makes perfect sense for the VA. They had the business problem of 133 treatment facilities with 133 custom versions of VistA. One of the beautiful things about VistA was that it was open to customization at the various installations, so that it could be tailored to specific needs, and this allowed innovation at the point of service. The problem is that the overall operational cost increases over time as the versions diverge, and interoperability may decrease. The VA created OSEHRA to bring these forks back together. By using open source methodologies, they created a common “VA Enterprise VistA” baseline, which gives a place for all those local improvements to be merged into a best-of-breed VistA.

This collaborative model is a huge win for innovation and continuous modernization. Allowing local innovation is both good and bad: the innovation part is good, but the loss of standardization is bad. Coupling a tolerance for local innovation with an enterprise reference standard makes the prospect win-win.

I’ve personally fixed three bugs in the Linux kernel, and successfully gotten those patches incorporated into the Torvalds-approved kernel.org kernel. I fixed the bugs because they affected me personally.   I worked to get them into the upstream because I didn’t want to have to keep re-applying my patches every time a new kernel was released.  There is also a non-trivial pride and ego-boost associated with that accomplishment.  These same incentives will cause VA doctors and IT staff to work to improve VistA locally, and also to merge those changes into an enterprise baseline.

These arguments in favor of an open-source development model could apply to the DoD also, or to a joint DoD-VA approach.

What’s not clear to me is how much VistA should play a starring role in that future.

Here’s an issue: building a vibrant, collaborative community around a software development project is about people and process as much as it is about technology. Some great books have been written about this. It will be difficult to get bright, talented software engineers to work on VistA, because of the 30-year-old tools and design practices. I have two degrees in computer science from MIT and you couldn’t pay me enough to work on VistA. Literally. Even if you aren’t a software engineer, take a look at this random sample of VistA source code, which is barely distinguishable from line noise:

GMRCEDT1 ;SLC/DCM,JFR - EDIT A CONSULT AND RE-SEND AS NEW ;08/20/09  12:16
 ;;3.0;CONSULT/REQUEST TRACKING;**1,5,12,15,22,33,47,66**;DEC 27, 1997;Build 30
 ;
 ; This routine invokes IA #2638 (^ORD(100.01,), #3991 (ICDAPIU), #10117 (VALM10)
 ;                         #10103 (XLFDT), #10104 (XLFSTR), #10060 (access ^VA(200)), #2056 (GET1^DIQ)
 ;
EN(GMRCO) ;GMRCO=IEN of consult from file 123
 ;GMRCSS=To Service   GMRCPROC=Procedure Request Type
 ;GMRCURG=Urgency     GMRCPL=Place Of Consultation
 ;GMRCATN=Attention   GMRCINO=Service is In/Out Patient
 ;GMRCPNM=Patient Name  GMRCDIAG=Provisional Diagnosis
 ;GMRCERDT=Earliest Appr. Date
 ;GMRCPROS=Prosthetics Service y/n
 N GMRCSS,GMRCPROC,GMRCURG,GMRCPL,GMRCATN,GMRCINO,GMRCDIAG,LN,GMRCRESP,GMRCERDT,GMRCPROS,IDX
 K ^TMP("GMRCR",$J,"ED") S GMRCLNO=1
 I $L($P(^GMR(123,+GMRCO,0),"^",12)) S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)="  CURRENT STATUS: (Not Editable): "_$P(^ORD(100.01,$P(^(0),"^",12),0),"^",1),GMRCLNO=GMRCLNO+1
 S GMRCD=0 F  S GMRCD=$O(^GMR(123,+GMRCO,40,"B",GMRCD)) Q:'GMRCD  S GMRCDD="" F  S GMRCDD=$O(^GMR(123,GMRCO,40,"B",GMRCD,GMRCDD)) Q:'GMRCDD  D
 .I $P(^GMR(123,+GMRCO,40,GMRCDD,0),"^",2)=19 S LN=0 D
 ..N GMRCPERS,GMRCTX
 ..I '$D(^GMR(123,+GMRCO,12)) D
 ...S GMRCPERS=+$P($G(^GMR(123,+GMRCO,40,GMRCDD,0)),"^",5)
 ...S GMRCPERS=$$GET1^DIQ(200,GMRCPERS,.01)
 ..I $D(^GMR(123,+GMRCO,12)) D
 ...I $P(^GMR(123,+GMRCO,12),U,5)="P" D
 ....S GMRCPERS=$P($G(^GMR(123,+GMRCO,40,GMRCDD,2)),U,1)
 ...I $P(^GMR(123,+GMRCO,12),U,5)="F" D
 ....S GMRCPERS=$P($G(^GMR(123,+GMRCO,40,GMRCDD,0)),U,5)
 ....S GMRCPERS=$$GET1^DIQ(200,GMRCPERS,.01)
 ..S GMRCTX="  CANCELLED BY (Not Editable): "_GMRCPERS
 ..S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=GMRCTX,GMRCLNO=GMRCLNO+1
 ..S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)="  CANCELLED COMMENT (Not Editable):",GMRCLNO=GMRCLNO+1
 ..S LN=$O(^GMR(123,+GMRCO,40,GMRCDD,1,LN)) Q:LN=""!(LN?1A.E)  I $L(^GMR(123,+GMRCO,40,GMRCDD,1,LN,0))>75 S FLG=1 D WPSET^GMRCUTIL("^GMR(123,+GMRCO,40,GMRCDD,1)","^TMP(""GMRCR"",$J,""ED"")","",.GMRCLNO,"",FLG)
 ..I '$D(FLG) S LN=0 F  S LN=$O(^GMR(123,+GMRCO,40,GMRCDD,1,LN)) Q:LN=""!(LN?1A.E)  S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=^(LN,0),GMRCLNO=GMRCLNO+1
 ..S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)="",$P(^(0),"-",79)=""
 ..S GMRCLNO=GMRCLNO+1
 ..Q
 .Q
 S GMRCSS=$S($D(GMRCEDT(1)):GMRCEDT(1),1:$P(^GMR(123,+GMRCO,0),"^",5)_U_$P(^GMR(123.5,$P(^GMR(123,+GMRCO,0),"^",5),0),U))
 S GMRCPROC=$S($D(GMRCED(1)):GMRCED(1),1:$P(^GMR(123,+GMRCO,0),"^",8)_U_$$GET1^DIQ(123.3,+$P(^GMR(123,+GMRCO,0),"^",8),.01))
 I $D(GMRCED(2)) S GMRCINO=GMRCED(2)
 I '$D(GMRCINO) S GMRCINO=$P(^GMR(123,+GMRCO,0),U,18)_U_$S($P(^(0),U,18)="I":"Inpatient",1:"Outpatient")
 S GMRCURG=$S($D(GMRCED(3)):GMRCED(3),1:$P(^GMR(123,+GMRCO,0),"^",9)_U_$$GET1^DIQ(101,+$P(^(0),"^",9),1))
 S GMRCPL=$S($D(GMRCED(4)):GMRCED(4),1:$P(^GMR(123,+GMRCO,0),"^",10)_U_$$GET1^DIQ(101,+$P(^(0),U,10),1))
 S GMRCPROS=$G(^GMR(123.5,$P(GMRCSS,U),"INT")) I $G(GMRCPROS)="" D
 .S GMRCERDT=$S($D(GMRCED(5)):GMRCED(5),1:$P(^GMR(123,+GMRCO,0),"^",24)_U_$$FMTE^XLFDT($P(^GMR(123,GMRCO,0),U,24)))
 S GMRCATN=$S($D(GMRCED(6)):GMRCED(6),1:$P(^GMR(123,+GMRCO,0),"^",11)_U_$$GET1^DIQ(200,+$P(^(0),U,11),.01))
 I '$D(^GMR(123,GMRCO,30.1)) D
 . I $D(GMRCED(7)),$L($P(GMRCED(7),U,2)) D  Q
 .. S GMRCDIAG=$P(GMRCED(7),U)_" ("_$P(GMRCED(7),U,2)_")"
 . S GMRCDIAG=$S($D(GMRCED(7)):GMRCED(7),1:$G(^GMR(123,+GMRCO,30)))
 I $D(^GMR(123,GMRCO,30.1)) D
 . I $D(GMRCED(7)),$L(GMRCED(7)) D  Q
 .. S GMRCDIAG=$P(GMRCED(7),U)_" ("_$P(GMRCED(7),U,2)_")"
 . S GMRCDIAG=$G(^GMR(123,+GMRCO,30))
 . I '$$STATCHK^ICDAPIU(^GMR(123,GMRCO,30.1),DT) D
 .. S GMRCDIAG=GMRCDIAG_"   "
 S GMRCREQ=$S(+$P(^GMR(123,+GMRCO,0),U,17)="P":"Procedure",1:"Consult")
 S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)="SENDING PROVIDER (Not Editable): "_$S($P($G(^GMR(123,+GMRCO,12)),U,6):$P(^GMR(123,+GMRCO,12),U,6),$P(^GMR(123,+GMRCO,0),"^",14):$$GET1^DIQ(200,+$P(^GMR(123,+GMRCO,0),"^",14),.01),1:"UNKNOWN"),GMRCLNO=GMRCLNO+1
 S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)="REQUEST TYPE (Not Editable): "_GMRCREQ,GMRCLNO=GMRCLNO+1
 S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=$$REPEAT^XLFSTR("-",79),GMRCLNO=GMRCLNO+1
 S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)="  TO SERVICE (Not Editable): "_$P(GMRCSS,U,2) S GMRCLNO=GMRCLNO+1
 S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=" ",GMRCLNO=GMRCLNO+1
 S IDX=1,^TMP("GMRCR",$J,"ED",GMRCLNO,0)=IDX_" PROCEDURE: "_$P(GMRCPROC,U,2)
 D:+GMRCPROC RVRS(GMRCLNO,$D(GMRCED(IDX))) S GMRCLNO=GMRCLNO+1,IDX=IDX+1
 S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=IDX_" Performed as INPT OR OUTPT: "_$P(GMRCINO,U,2) D RVRS(GMRCLNO,$D(GMRCED(2))) S GMRCLNO=GMRCLNO+1,IDX=IDX+1
 S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=IDX_" URGENCY: "_$P(GMRCURG,U,2) D RVRS(GMRCLNO,$D(GMRCED(3))) S GMRCLNO=GMRCLNO+1,IDX=IDX+1
 S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=IDX_" PLACE OF CONSULTATION: "_$P(GMRCPL,U,2) D RVRS(GMRCLNO,$D(GMRCED(4))) S GMRCLNO=GMRCLNO+1,IDX=IDX+1
 I $G(GMRCPROS)="" S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=IDX_" EARLIEST APPROPRIATE DATE: "_$P(GMRCERDT,U,2) D RVRS(GMRCLNO,$D(GMRCED(5))) S GMRCLNO=GMRCLNO+1,IDX=IDX+1
 S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=IDX_" ATTENTION (CONSULTANT): "_$P(GMRCATN,U,2) D RVRS(GMRCLNO,$D(GMRCED(6))) S GMRCLNO=GMRCLNO+1,IDX=IDX+1
 S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=IDX_" PROVISIONAL DIAGNOSIS: "_GMRCDIAG D RVRS(GMRCLNO,$D(GMRCED(7))) S GMRCLNO=GMRCLNO+1,IDX=IDX+1
 S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=IDX_" REASON FOR REQUEST:" D RVRS(GMRCLNO,$D(^TMP("GMRCED",$J,20))) S GMRCLNO=GMRCLNO+1,IDX=IDX+1 D
 . I $D(^TMP("GMRCED",$J,20)) D  Q
 .. N ND S ND=0
 .. F  S ND=$O(^TMP("GMRCED",$J,20,ND)) Q:'ND  D
 ... D KILL^VALM10(GMRCLNO)
 ... S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=^TMP("GMRCED",$J,20,ND,0)
 ... S GMRCLNO=GMRCLNO+1
 . N ND S ND=0
 . F  S ND=$O(^GMR(123,+GMRCO,20,ND)) Q:ND=""  D
 .. S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=^GMR(123,+GMRCO,20,ND,0)
 .. S GMRCLNO=GMRCLNO+1
 .Q
 S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)="",GMRCLNO=GMRCLNO+1,^TMP("GMRCR",$J,"ED",GMRCLNO,0)=IDX_" COMMENT(S): (Add Only)" D RVRS(GMRCLNO) S GMRCLNO=GMRCLNO+1
 I $D(^TMP("GMRCED",$J,40)) D
 . D KILL^VALM10(GMRCLNO)
 . S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)="  New Comment:",GMRCLNO=GMRCLNO+1
 . N ND S ND=0 F  S ND=$O(^TMP("GMRCED",$J,40,ND)) Q:'ND  D
 .. S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=^TMP("GMRCED",$J,40,ND,0)
 .. S GMRCLNO=GMRCLNO+1
 N GMRCEDCT
 S GMRCD=0,GMRCEDCT=0 F  S GMRCD=$O(^GMR(123,+GMRCO,40,"B",GMRCD)) Q:'GMRCD  S GMRCDD="",GMRCDD=$O(^GMR(123,+GMRCO,40,"B",GMRCD,GMRCDD)) Q:'GMRCDD  D
 .I $P(^GMR(123,+GMRCO,40,GMRCDD,0),"^",2)=20 S LN=0,GMRCEDCT=GMRCEDCT+1,GMRCEDCM(GMRCEDCT)=GMRCDD D
 ..S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)="",GMRCLNO=GMRCLNO+1,^TMP("GMRCR",$J,"ED",GMRCLNO,0)="ADDED COMMENT (Not Editable) Entered: "_$P($$FMTE^XLFDT($P(^GMR(123,+GMRCO,40,GMRCDD,0),"^",1)),"@",1)
 ..S GMRCRESP=$S($L($P($G(^GMR(123,+GMRCO,40,GMRCDD,0)),"^",5)):$$GET1^DIQ(200,$P(^GMR(123,+GMRCO,40,GMRCDD,0),"^",5),.01),$L($P($G(^GMR(123,+GMRCO,40,GMRCDD,2)),"^",1)):$P(^GMR(123,+GMRCO,40,GMRCDD,2),"^",1),1:"UNKNOWN")
 ..S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=^TMP("GMRCR",$J,"ED",GMRCLNO,0)_" BY: "_GMRCRESP,GMRCLNO=GMRCLNO+1
 ..;S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=^TMP("GMRCR",$J,"ED",GMRCLNO,0)_" BY: "_$S($L($P(^GMR(123,+GMRCO,40,GMRCDD,0),"^",4)):$P(^VA(200,$P(^GMR(123,+GMRCO,40,GMRCDD,0),"^",4),0),"^",1),1:"UNKNOWN"),GMRCLNO=GMRCLNO+1
 ..S LN=$O(^GMR(123,+GMRCO,40,GMRCDD,1,LN)) Q:LN=""!(LN?1A.E)  I $L(^GMR(123,+GMRCO,40,GMRCDD,1,LN,0))>75 S FLG=1 D WPSET^GMRCUTIL("^GMR(123,+GMRCO,40,GMRCDD,1)","^TMP(""GMRCR"",$J,""ED"")","",.GMRCLNO,"",FLG) Q
 ..S LN=0 F  S LN=$O(^GMR(123,+GMRCO,40,GMRCDD,1,LN)) Q:LN=""!(LN?1A.E)  S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=^(LN,0),GMRCLNO=GMRCLNO+1
 ..Q
 .Q
 S ^TMP("GMRCR",$J,"ED",GMRCLNO,0)=""
 K FLG
 Q
RVRS(LINE,EDITED) ;reverse video for fields that can be edited
 I '$G(EDITED) D CNTRL^VALM10(LINE,1,1,IORVON,IORVOFF) Q
 D CNTRL^VALM10(LINE,1,1,IORVON_IOINHI,IORVOFF_IOINORM)
 Q

Not all the VistA code is this bad; some of it is worse.

That said, the DoD alternative (AHLTA), and in fact the best-known proprietary alternative as well, share the same obscure programming language (MUMPS) and are almost certainly equally bad, but since their code is not public, it’s harder to critique them. In fact, I would expect them to be worse, since they were not written with collaborative development in mind.

Say all the nice things about MUMPS you want:  In the end, the choice of an ugly, archaic technology will decrease interest in any project by prospective contributors, thus decreasing the value of the collaborative model.  This is Technical Debt we may not be able to repay.

Interestingly, Philip Newcomb, CEO of The Software Revolution Inc., has asserted that his company’s technology could convert VistA from MUMPS to J2EE in about a year for $10m. If true, this would be a bargain, IMHO. I’ve met with Phil twice in the past decade, and his claims are impressive, although I have no personal experience with the results of his company’s work. Other companies perform similar services: Hatha Systems, for example, claims to do automated analysis of MUMPS.

Tom Munnecke, one of the original architects of VistA, has eloquently defended the MUMPS database design, pointing out that medical data is rarely suited to the structured, SQL-esque approach of relational databases. He makes some fabulous points, and I actually think that he’s right in that many of the decisions that were made for MUMPS and VistA were remarkably prescient. It’s taken the rest of the IT world three decades to rediscover document-structured non-relational databases, now known collectively as NoSQL. Now that there’s a huge amount of energy and expertise focused on large-scale non-relational data stores, maybe we should consider how to use that talent and energy for EHR?
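To make the "non-relational" point concrete, here is a minimal Python sketch of the kind of structure a MUMPS global provides: a sparse tree of values addressed by lists of subscripts, walked one sibling at a time in the spirit of the `$O(...)` calls in the VistA sample above. Everything in it is an assumption for illustration: the `SparseTree` class, the patient data, and the simplified ordering rule (numbers before strings). It is not how GT.M or Caché actually store data.

```python
from bisect import bisect_right


def _collate(subscript):
    # Assumption: numbers sort before strings, a simplification of M collation.
    if isinstance(subscript, (int, float)):
        return (0, subscript, "")
    return (1, 0, subscript)


class SparseTree:
    """Toy model of a MUMPS-style global: values keyed by tuples of subscripts."""

    def __init__(self):
        self._nodes = {}  # maps a tuple of subscripts to a stored value

    def set(self, *args):
        """set(sub1, sub2, ..., value): store value at the given subscripts."""
        *subs, value = args
        self._nodes[tuple(subs)] = value

    def get(self, *subs, default=""):
        """Return the value at the given subscripts, or default if unset."""
        return self._nodes.get(tuple(subs), default)

    def order(self, *subs):
        """Return the sibling subscript that follows subs[-1], or None.

        Passing "" as the last subscript returns the first sibling,
        loosely mirroring how $ORDER is used to start a traversal.
        """
        *parent, last = subs
        parent = tuple(parent)
        depth = len(parent)
        siblings = sorted(
            {k[depth] for k in self._nodes if len(k) > depth and k[:depth] == parent},
            key=_collate,
        )
        if last == "":
            return siblings[0] if siblings else None
        keys = [_collate(s) for s in siblings]
        i = bisect_right(keys, _collate(last))
        return siblings[i] if i < len(siblings) else None


# Hypothetical records: no schema, and patient 9 simply has no allergy subtree.
db = SparseTree()
db.set("patient", 7, "name", "DOE,JANE")
db.set("patient", 7, "allergy", 1, "PENICILLIN")
db.set("patient", 7, "allergy", 2, "LATEX")
db.set("patient", 9, "name", "ROE,RICHARD")


def allergies(tree, patient_id):
    """Walk the allergy subtree of one patient in subscript order."""
    found = []
    n = tree.order("patient", patient_id, "allergy", "")
    while n is not None:
        found.append(tree.get("patient", patient_id, "allergy", n))
        n = tree.order("patient", patient_id, "allergy", n)
    return found
```

The design choice worth noticing is that absent data costs nothing: a patient with no allergies has no empty rows or NULL columns, just no subtree.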

In short, I think the VA should keep doing the OSEHRA thing to consolidate and modernize VistA. There are two threads to that: first, they need to consolidate into an enterprise version of VistA for the VA to bring together the forks (which they are doing), and second, they should refactor and modernize the architecture and the tooling (which they claim they are doing). I am suspicious that they aren’t being bold enough, but I don’t know.

I remain skeptical that VistA can survive in the long term as a vibrant, community-driven open source project, if it continues as it is. In order to make VistA a viable project, the current MUMPS-based database needs to be replaced with a modern NoSQL datastore of some kind, and the hyper-abbreviated MUMPS code needs to be replaced with something readable and maintainable. A colleague of mine (David Wheeler) once pointed out that MUMPS code doesn’t have to be unreadable, but the coding practices of VistA do not lend themselves to readability. A bold step would be to re-write the thing in Java or some other modern language; a minimalist step would just re-write it in MUMPS that doesn’t suck so much. In David’s words:

I think you should note another alternative as well: Keep MUMPS, but translate the current MUMPS into readable code.

Yes, you’d still be working in an uncommon language.  But that transformation would be especially trivial to do (and trivial to automate), and the risks from auto-translation would be far lower because the underlying environment and assumptions would be unchanged.

It seems to me that the big problem here isn’t really MUMPS, it’s the way MUMPS has been used.  Developers have used MUMPS’ “you can abbreviate anything” combined with “use bad names”, which perhaps made sense 30 years ago but is a bad idea today.  But you do not *HAVE* to create ugly code in MUMPS.

Using the Wikipedia MUMPS article example, here’s some line noise:

hello() w "Hello, World!",! q

But here’s legal MUMPS – it’s the same code, but unencrypted:


hello()
    write "Hello, World!",!
    quit

I have no idea if translating to another language would be a better trade-off than translating it to readable MUMPS.  But it’d be easy to hire somebody to briefly investigate the options, pros, and cons, so that a reasonable decision (based on EVIDENCE) could be made.  And I think, in fairness, that alternative should be considered.
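As a rough illustration of how mechanical the first step of David's suggestion is, here is a minimal Python sketch that expands abbreviated M command names in a single line of code. It is a sketch under stated assumptions, not a real translator: the `COMMANDS` table is a partial, hand-picked list; ambiguous abbreviations such as `H` (HALT or HANG) are deliberately left alone; intrinsic functions such as `$O` and `$P` are not expanded; and variable names are untouched. The expanded commands also stay on the same line, because splitting them onto separate lines is not always safe in M (`IF` and `FOR` apply to the rest of their line).

```python
import re

# Assumption: a partial table of standard M command abbreviations.
COMMANDS = {
    "d": "do", "e": "else", "f": "for", "g": "goto", "i": "if",
    "k": "kill", "l": "lock", "m": "merge", "n": "new", "q": "quit",
    "r": "read", "s": "set", "u": "use", "w": "write", "x": "xecute",
}


def split_comment(text):
    """Split text at the first ';' that is outside a string literal."""
    in_string = False
    for i, ch in enumerate(text):
        if ch == '"':
            in_string = not in_string
        elif ch == ";" and not in_string:
            return text[:i], text[i:]
    return text, ""


def split_outside_quotes(text):
    """Split on single spaces outside string literals, keeping empty tokens.

    An argumentless command is followed by two spaces, so the empty token
    between them stands in for its (empty) argument.
    """
    tokens, current, in_string = [], [], False
    for ch in text:
        if ch == '"':
            in_string = not in_string
        if ch == " " and not in_string:
            tokens.append("".join(current))
            current = []
        else:
            current.append(ch)
    tokens.append("".join(current))
    return tokens


def expand_line(line):
    """Expand abbreviated command names in one line of M code."""
    # Optional label, then whitespace plus any block-structure dots, then code.
    match = re.match(r"(\S*)([ \t]+(?:\.[ \t]*)*)(.*)", line)
    if not match:
        return line  # empty line, or a label with nothing after it
    label, indent, rest = match.groups()
    body, comment = split_comment(rest)
    tokens = split_outside_quotes(body)
    # Tokens alternate: command, argument, command, argument, ...
    for i in range(0, len(tokens), 2):
        name, colon, postcondition = tokens[i].partition(":")
        full = COMMANDS.get(name.lower())
        if full:
            if name.isupper():
                full = full.upper()  # keep the casing style of the source
            tokens[i] = full + colon + postcondition
    return label + indent + " ".join(tokens) + comment


print(expand_line('hello() w "Hello, World!",! q'))
# prints: hello() write "Hello, World!",! quit
```

Run over a line from the VistA sample, `expand_line(' K ^TMP("GMRCR",$J,"ED") S GMRCLNO=1')` returns ` KILL ^TMP("GMRCR",$J,"ED") SET GMRCLNO=1`. The hard part of readability, choosing better names than `GMRCLNO`, is exactly what a table-driven pass like this cannot do.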

I also have concerns about the governance model of OSEHRA, but, as this blog post is already too long, I’ll save that for another article.

As promised at the top: Jon Stewart on AHLTA vs VistA:

24 thoughts on “VistA, MUMPS and NoSQL”

  1. Rob Tweed

    In general I agree with your conclusions. However the mistake everyone makes with Mumps is the conflation of its language and database. Critics of Mumps always focus on the arcane and outdated nature of the language and fail to recognise that the thing that has made Mumps applications so successful in healthcare is the database. They also fail to recognise that all the main implementations of Mumps (GT.M and Cache) make it possible to use alternative languages.

    It turns out that JavaScript, running on the server on the Node.js platform, is a perfect bedfellow for the Mumps database, and can even interoperate with existing legacy code written in the Mumps language. This, in my opinion, is the future for healthcare applications, exploiting the unique power and flexibility of the Mumps database – a NoSQL database that pre-dated the NoSQL movement by several decades, and proven to be ideally suited for the rigours of the healthcare environment. The social media and games-focused mindsets of the new generation of JavaScript developers are just what’s needed to inject fresh thinking into the problems of EHR design and healthcare-related big data visualisation. Conversely, it’s clear that it’s going to be well nigh impossible to recruit and retain new developers to learn and develop in the Mumps language, so tapping into the growing community of JavaScript developers is key to the ongoing support and maintenance of the many huge legacy EHRs such as VistA.

    I’ve written at length about this in my many blog articles at http://robtweed.wordpress.com. For starters I’d point readers at the following postings, but please browse extensively:

    http://robtweed.wordpress.com/2013/01/22/can-a-phoenix-rise-from-the-ashes-of-mumps/
    http://robtweed.wordpress.com/2013/01/23/can-a-phoenix-arise-from-the-ashes-of-mumps-part-2/
    http://robtweed.wordpress.com/2013/01/24/a-phoenix-rises/
    http://robtweed.wordpress.com/2013/01/26/to-the-node-js-community-healthcare-needs-your-help/

    See also:

    https://www.vxvista.org/display/vx4Learn/GT.M+interface+to+NodeJS
    http://opensource.com/education/13/1/nodejs-nosql-m-healthcare-it

    Rob

    1. Dan Post author

      Thanks for your thoughts, Rob.

      I was aware of the distinction between MUMPS-the-language and MUMPS-the-database. Perhaps unfairly, I dismissed both as archaic. If I’d known of a GT.M to Node.js interface, I might well have been more gracious. Even so, I think it’s hard to argue with the developer-mindshare of MongoDB or Cassandra, compared to the M database.

      I almost advocated Node.js in my post, but I held back a little. In my experience, Node.js is a little immature: all four of my Node.js apps broke recently on the v.0.10 upgrade. (q.v. https://github.com/developmentseed/node-sqlite3/issues/126, http://geoff.greer.fm/2012/06/10/nodejs-dealing-with-errors/)

      Considering the “enterprisey” nature of EHR, I reluctantly concede that Java is a more likely choice. (I’ve never cared much for Java – admittedly for irrational reasons.) EHR for large enterprises (i.e. Defense and VA health) is not a welcoming environment for small-scale innovation. It’s not so easy to just stand up an instance of something like VistA to understand and hack on it. (This is both a weakness and strength.)

      On the gripping hand, the webinar from Luis Ibáñez is extremely persuasive to me. The fact that Kitware and OSEHRA are already working the seams between MUMPS and the modern world is incredibly encouraging.

      1. Rob Tweed

        Yes, it’s a common mistake to believe that just because the Mumps language is outdated, the same applies to the Mumps database. You may find this paper that I co-authored some time ago useful in this regard:

        http://www.mgateway.com/docs/universalNoSQL.pdf

        The significant advantage that the Mumps database has over NoSQL databases such as MongoDB and Cassandra is its multi-facetted capability. See:

        http://robtweed.wordpress.com/2013/03/26/mumps-the-proto-database-or-how-to-build-your-own-nosql-database/

        As far as performance is concerned, both GT.M and Cache are orders of magnitude faster than MongoDB. Indeed I ran some benchmarks a couple of years ago comparing the Mumps database in Cache + Node.js against Redis + Node.js – despite Redis being generally assumed to be the fastest thing on the planet, Cache/Mumps gave it a serious spanking.

        Get over the image problem caused by its language, and you’ll find that the database behind VistA is really something special and just as relevant today as it has ever been.

        Rob

      2. Luis Ibanez

        Dan,

        Thanks for a well-thought article.

        I concede that, to the untrained eye,
        the M code can appear unreadable.

        But the same is true for Perl and Assembly… 🙂

        and depending on who we ask,
        C++ Template Meta-Programming can be quite daunting too…

        I lean towards thinking that Training and Education are a better and
        cheaper solution than a full rewriting of a working system. Starting
        from the fact, that even if one wanted to rewrite such system, there
        will still be a need for qualified developers who could read and
        understand the current code that they would be porting.

        So,
        if the problem is that not enough people know the language…
        well…, maybe we should teach the language 🙂

        It certainly will cost less than $16B (which is the estimated cost of a
        VistA replacement) to train a new generation of developers who do
        understand the language, maybe even with a more modern twist to it,
        and can now maintain and improve the system.

        A VistA community will need 5,000 developers (to have the equivalent
        developer / lines-of-code ratio of the Linux Kernel).

        Therefore, as long as we can train an M developer for less than $3M per
        developer… then: Education is always cheaper and more effective
        than system replacement. This without even taking into account the
        network effects that are generated when a critical mass of developers
        becomes proficient.

        Let’s keep in mind that a typical undergrad engineering college
        education in the US will cost around $160K per student.
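The break-even figure in this comment can be reproduced with a few lines of Python. The inputs are the numbers quoted above ($16B, 5,000 developers, $160K per degree); the function and constant names are hypothetical, and the $3M in the comment is the rounded form of the $3.2M computed here.

```python
def break_even_per_developer(replacement_cost, developers):
    """Largest per-developer training budget that still undercuts replacement."""
    return replacement_cost / developers


REPLACEMENT_COST = 16_000_000_000  # estimated cost of a VistA replacement
DEVELOPERS = 5_000                 # community size suggested in the comment
DEGREE_COST = 160_000              # typical US undergrad engineering education

per_developer = break_even_per_developer(REPLACEMENT_COST, DEVELOPERS)
degree_total = DEVELOPERS * DEGREE_COST
share_of_replacement = degree_total / REPLACEMENT_COST

print(f"Break-even training budget per developer: ${per_developer:,.0f}")
print(f"Full degrees for all developers: ${degree_total:,.0f}")
print(f"Share of the replacement cost: {share_of_replacement:.0%}")
```

Even paying for a full four-year degree for every one of the 5,000 developers comes to $800M, about 5% of the quoted replacement cost.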

        Following that line of thought,

        Here are some links on the Educational activities that we have been
        engaged in since 2011 around M, VistA and the EWD web framework
        in higher education.

        These activities have taken place at the Rensselaer Polytechnic Institute
        and the State University of New York at Albany (SUNY-Albany).

        Coincidentally,

        This Thursday is the 3rd Open Source Festival at SUNY-Albany
        http://ocs.opensourcefestival.org/index.php/osf/osf13/schedConf/schedule
        http://www.kitware.com/blog/home/post/473

        Which will include several talks related to M and VistA:
        http://ocs.opensourcefestival.org/index.php/osf/osf13/paper/view/13
        http://ocs.opensourcefestival.org/index.php/osf/osf13/paper/view/26

        and this Friday we are hosting a VistA Workshop at SUNY-Albany
        http://www.osehra.org/blog/vista-workshop-suny-albany-friday-april-26th-2013
        where we will be working with students who learned M in the past weeks
        as part of their training on NoSQL databases, and their course of
        Web Development.

        Since I taught the section on M in both classes at SUNY, as well as
        the classes at RPI, I’m happy to report that when presented with
        properly prepared material, the students didn’t have any trouble
        going through an introduction to

        M as a NoSQL hierarchical database,
        http://www.osehra.org/content/practical-introduction-m-database-part-i

        or
        M as a programming Language:
        http://www.osehra.org/content/practical-introduction-m-language-part-i

        here is a combined report:
        http://www.osehra.org/blog/teaching-m-database-and-m-language-suny-albany-spring-2013-part-i

        as well as the EWD web development framework
        that uses M as a backend database:
        http://www.osehra.org/blog/teaching-ewd-suny-albany-spring-2013-part-i

        Students learned M after having seen:
        RDF, SPARQL, Neo4j and MongoDB.

        At SUNY-Albany, we have also prepared material on Node.js + M:
        https://github.com/SUNY-Albany-CCI/open-source-databases-tutorial/tree/master/source/M/Examples/Node.js

        A combination in which one can use the M database from the
        Node.js language (server-side Javascript)
        https://www.vxvista.org/display/vx4Learn/GT.M+interface+to+NodeJS

        This is based on work that Rob Tweed and David Wicksell have
        done and placed in open source projects:
        https://github.com/dlwicksell/nodem
        https://github.com/robtweed/ewdGateway

        We are planning on converting many of these Educational modules
        to be suitable for MOOC courses, during the Summer.

        At the high dissemination rate of MOOCs, we can certainly train
        M developers for much, much less than $3M per person. 🙂

        Here are a few more posts on the promotion of M and Node.js:

        http://opensource.com/health/13/4/nodejs-integrates-m-tutorial-part-i
        http://opensource.com/education/13/1/nodejs-nosql-m-healthcare-it
        http://opensource.com/education/13/1/teaching-open-source-nosql-databases-final-lesson
        http://opensource.com/education/12/11/teaching-open-source-nosql-databases
        http://opensource.com/education/12/11/open-course-open-source-nosql-databases
        http://opensource.com/health/12/7/join-m-revolution-m-and-r-programming-languages
        http://opensource.com/health/12/3/join-m-revolution%E2%80%94get-your-tools
        http://opensource.com/health/12/2/join-m-revolution

        As well as connections to the M database from other
        languages, such as:

        Python
        http://www.osehra.org/blog/gtm-binding-python

        and Ruby
        http://www.osehra.org/blog/gtm-binding-ruby

        Luis

    2. Philip Newcomb

      TSRI’s claims to be able to modernize Vista as well as other EMR variants written in MUMPS are based on a series of pilot and research projects TSRI has undertaken since 2005 for the VA, MHS, TATRC, in which we have conducted investigative research to explore technical feasibility, technical challenges associated with conversion to Java and EGL, consolidation of similar and identical code, replacement of architectural features, and gathered metrics to support ROI analysis associated with replacement of MUMPS and its data base with modern alternatives. Our estimate of the approximate cost to modernize Vista and other MUMPS based EMR system is based on cost-accounting analysis drawn from several hundred large information systems that TSRI has modernized with 100% levels of automation. Our research on the feasibility and ROI analysis associated with MUMPS modernization is on-going, and is formally briefed to senior leadership at the MHS and DOD/VA IPO and VA on a quarterly basis.

      TSRI has an excellent track record for modernization of high assurance systems by means of a 100% automated model-based and rule-driven code conversion, refactoring and documentation process that adheres to the OMG ADM approach. Notable systems modernized by means of this process include the Cobra Dane Ballistic Missile Defense system, Army Field Artillery Tactical Data Systems, Milstar, Global Positioning and Navigation System (GPNTS), to name a few. One of the most significant of the systems TSRI has modernized is the flight operations system known as TopSky. TopSky is marketed by Thales Air Systems for the European, Asian and Australian air spaces. In October 2011 I received the Stevens Award from the IEEE Working Conference for Reverse Engineering at the University of Limerick, Ireland, in recognition of TSRI’s contribution to the advancement of commercial architecture-driven modernization technologies and practices, and to commemorate the first deployment of TopSky in April of 2011 at Shannon Center in Ireland, where the flight operations software we modernized supports traffic center operations and controls all transatlantic air traffic entering Irish air space.

      We are still performing research on a wide array of topics associated with modernization of these kinds of systems, but thus far have not encountered any show stoppers that would contradict our earlier cost predictions. Our current research focus is primarily concerned with (1) optimization of the top-line value of the conversion process to assure the quality and performance of target code and DB will meet TCO cost reduction objectives (2) assurance that all cost factors associated with an overall modernization process that encompasses conversion, unification and documentation of multiple large stove pipe systems are fully understood and predictable. To date we have run in excess of 20,000,000 statements from various MUMPS EMR systems through our ADM-driven process (normalizing for the 3 to 1 ratio of MUMPS lines to executable statements). Some of our results will be publicly presented at an upcoming OSEHRA conference.

      Philip Newcomb
      Chief Executive Officer, Chairman, Founder
      The Software Revolution, Inc
      11410 NE 122nd Way, Suite 105 Kirkland, WA 98034
      office: (425) 284 2790
      ——————————————————————-
      TSRI Website: http://www.tsri.com/
      Stevens Awardee: http://www.reengineer.org/stevens/
      Recent Articles:
      • Architecture Driven Modernization – Government CIO http://www.governmentciomagazine.com/2013/01/architecture-driven-modernization
      • The Implications of Lehman’s Laws on IT Priorities in the Wake of the Sequester – Government CIO http://www.governmentciomagazine.com/2013/04/implications-lehman%E2%80%99s-laws-it-priorities-wake-sequester

  2. Rama Moorthy

    Dan,

    A note of clarification. Hatha Systems does provide semantic extraction and analysis which can be used as input for translation. We strongly believe, and continue to prove, that knowledge extraction is the first and most critical step in understanding your legacy environment before any migration to the ‘to be’ environment is possible. We don’t support MUMPS today …. We support legacy environments and have determined MUMPS is doable, but have NOT YET built the parser required for MUMPS analysis, which is not a huge investment. Additionally, a translation cost of $10M seems a little low based on our modernization experience. There is more to modernization than moving code from one construct to another. Since the move is a paradigm shift, it will take effort… but whether Phil’s solutions or our Knowledge Refinery, they both help significantly reduce cost by bringing automation into the equation. And of course with automation human error and inefficiencies can be reduced dramatically. Thanks.

  3. Rama Moorthy

    Good comment from Rob Tweed. I tend to agree with Rob that it is important not to rush into a modernization effort. MUMPS from our analysis does add a lot of value and is obviously useful despite its sluggishness. That said, the MUMPS environment in all its capabilities and flexibilities, has in all likelihood created the largest technical debt of any operational systems in the federal government that I am aware of. Perhaps this claim is strong. But one should consider that when a community of contributors is given access to contribute, there needs to be some rigor and process of acceptance, with transparency into the changes made to the system – other than what is in the human brain. People retire and leave. The issue is not to modernize, the issue is to reduce VistA’s technical debt. This requires an understanding of the most current state of the systems…. and hence again… knowledge extraction and analysis… document the whole system. Use tools or don’t, but it is required, and soon.

  4. David Whitten

    I am a co-founder of a volunteer group (WorldVistA) that has spent the last ten years making sure that the VistA system is available to people outside the VA. The code itself, as a Work of the Government, is public domain, and has been available to the general public since the early 1980s.

    I am trying to understand some of the terms used in this discussion.
    Specifically, the Wikipedia definition of technical debt (http://en.wikipedia.org/wiki/Technical_debt) is not really applicable to the MUMPS based VistA software.

    Technical debt (also known as design debt or code debt) is a neologistic metaphor referring to the eventual consequences of poor or evolving software architecture and software development within a codebase. The debt can be thought of as work that needs to be done before a particular job can be considered complete. As a change is started on a codebase, there is often the need to make other coordinated changes at the same time in other parts of the codebase or documentation. The other required, but uncompleted changes, are considered debt that must be paid at some point in the future.

    The VistA system has a quite advanced software architecture with a clear modular design, careful attention to portability and standard MUMPS language usage, a client-server architecture that supports multiple GUIs and uses minimal resources, and a strong focus on medical and clinical best practices. It has hundreds of pages of documentation at the end-user level and a high-level, self-describing, schema-driven tool set built on a NoSQL database. The system has many qualities that could be described in a buzz-word compliant fashion and make most folks happy.

    The VistA System is also a system that was originally developed in the late 1970s and has been adapted to modern practices over the last forty years. It is written in a language which took full advantage of minimal resources by having an extremely terse syntax, developed in an environment that was closed off from mainstream computer science, and created by people who were focused on medical state-of-the-art more than computer science state of the art. VistA also has been slowly starved of funds and expertise for almost 20 years yet still has a justified reputation for being the best Hospital Information System for patient care available.

    It seems strange that the very virtues that supported VistA are the ones that many use against it today. VistA and MUMPS use fewer computer resources, CPU and memory and hard disk space than any system I know of.

    I must close this for now, but will follow up this comment at a later time.
    David Whitten
    713-870-3834

    Reply
    1. Dan Post author

      As I understand the term, “Technical Debt” is any technical/engineering work that is recognized as necessary, but which has been deferred, generally for pragmatic reasons. This can be due to poor design choices, changed understanding of requirements, or external factors (e.g. a compiler upgrade).

      See http://c2.com/cgi/wiki?TechnicalDebt for much other discussion.

      As an example, another project which I’m involved with (Ozone Widget Framework) was based on ExtJS… which was a brilliant framework to use at the time. ExtJS was *the* premier framework, and was an immense enabler for OWF. But the years roll on, and one of the primary user complaints has been responsiveness. Detailed performance measures and prototyping showed that ExtJS was the bottleneck, and that refactoring using jQuery, Modernizr, and Bootstrap could significantly improve performance. At that moment, refactoring OWF to remove the dependency on ExtJS became technical debt.

      With respect to VistA, MUMPS may likewise have been a brilliant choice, but as soon as it’s recognized as an albatross, it becomes technical debt. The question then remains: Is MUMPS an albatross?

      I stand by my statement: “Say all the nice things about MUMPS you want: In the end, the choice of an ugly, archaic technology will decrease interest in any project by prospective contributors, thus decreasing the value of the collaborative model. This is Technical Debt we may not be able to repay.”

      Reply
      1. Rama Moorthy

        Dan –

        Thanks for the clarification on technical debt. I unlinked for a number of hours. I want to clarify that all software has some technical debt. One can have technical debt in COBOL, Java, PL-1, C, C#… When software is rushed out the door, compromises are made which can increase technical debt, and thus the level of overall sustainment needed over time to address issues that impact the performance, function, etc. of the system… not to mention security, assurance and other factors.

        What I was getting at is that knowledge extraction can help better understand the most current state of a software system, in this case VistA. It provides a way to see the detailed design, architecture, and business layer implementation in the most current state of the software system, to better understand the burdens carried by the system which have increased that technical debt. The progression of automated software metadata extraction tools allows this to work. Obviously our solution, the Knowledge Refinery, does this (sorry for the promotion). There are other tools that have also played in this area and provide various layers of knowledge extraction, if not all layers. When dealing with something as vast as VistA, understanding what is there is half the battle, and doing it without automation is nearly impossible. Many think that just because something is open source you have the transparency you need. The reality is that the knowledge of the system is implemented in the dependencies as well as the artifacts built using the code. Without automated extraction of the metadata you will never truly understand the system for decision making. For example, pulling out a call map manually, without this sort of automation, can take hours to days or more, even if sources are available. This sort of understanding allows even MUMPS, which has been perhaps overly criticized, to shed the proverbial monkey on its back. The understanding allows for decisions to be made such as ‘whether one should move out of MUMPS or not’, ‘can I improve the system or component in its current MUMPS environment through sustainment activities?’, or ‘If I can’t sustain it, then let me look at the translation options I have in front of me.’ Choosing which components need to move to a new environment becomes easier. This process is working well in the world of IBM COBOL, Unisys COBOL, C, Java, and many other languages. Granted, MUMPS does have some added uniqueness, as we are all aware… but methods have been developed to prove the uniqueness can be addressed with accuracy.
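
        To make the "call map" idea concrete, here is a minimal Python sketch, not the Knowledge Refinery or any vendor tool. It assumes routine sources are already available as text, and it scans only for the simplest external references: `DO`/`GOTO` (with an optional postconditional) followed by `TAG^ROUTINE`, and `$$TAG^ROUTINE` extrinsic calls. The routine names `ZZDEMO` and `ZZOTHER`, the regular expression, and the `call_map` helper are all illustrative assumptions. A real extractor needs a proper M parser: this one misses indirection, misses second and later arguments of a `DO`, and would mishandle a semicolon inside a string literal.

```python
import re
from collections import defaultdict

# Naive pattern for external calls: DO/GOTO (optionally with a :postconditional)
# followed by [TAG]^ROUTINE, or an extrinsic function call $$TAG^ROUTINE.
CALL = re.compile(
    r"(?:\b(?:DO|D|GOTO|G)(?::\S+)?\s+|\$\$)"
    r"([A-Za-z%][A-Za-z0-9]*)?\^([A-Za-z%][A-Za-z0-9]*)"
)


def call_map(routines):
    """Map each routine name to the sorted list of routines it appears to call.

    `routines` is a dict of routine name -> source text (a hypothetical input shape).
    """
    graph = defaultdict(set)
    for name, source in routines.items():
        for line in source.splitlines():
            # Drop everything after the first ';' (naive comment stripping).
            code = line.split(";", 1)[0]
            for _tag, callee in CALL.findall(code):
                graph[name].add(callee)
    return {name: sorted(callees) for name, callees in graph.items()}


# Made-up sample routines for illustration only.
SAMPLE = {
    "ZZDEMO": "\n".join([
        "ZZDEMO ;demo routine; the D ^IGNORED here is inside a comment",
        " D EN^XUP",
        " S X=$$GET1^DIQ(2,1,.01)",
        " D:X'=\"\" ^DIC",
        " Q",
    ]),
    "ZZOTHER": " G MAIN^ZZDEMO",
}

if __name__ == "__main__":
    for caller, callees in call_map(SAMPLE).items():
        print(caller, "->", ", ".join(callees))
```

        Even a crude scan like this shows why automation matters at VistA's scale: the output is a caller-to-callee graph that can be fed to any graph tool, whereas collecting the same edges by reading routines one at a time does not scale.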

        As a side note: there are approximately 1 trillion lines of COBOL legacy code out there in operation. And there is a lot of technical debt… but understanding through knowledge extraction has given COBOL a new lease on life. That would be in the case of IBM (COBOL, CICS, JCL, DB2, etc.) and Unisys (COBOL, WFL, ALGOL, etc.)… Many choose to keep COBOL alive because it is efficient for specific applications: high-volume transactions where performance, accuracy and traceability become critical for business systems. And understanding the code through knowledge extraction is allowing them to more cost-effectively sustain the code and migrate what requires migration. It becomes a business and a risk decision.

        Anyway… hopefully you get the picture….

        Reply
  5. Philip Newcomb

    David,

    I do not argue with your assessment of the quality of VistA from a medical/clinical perspective. For its time MUMPS provided an excellent, highly flexible and extensible language for developing clinical applications, with features that were not available in most languages of its time. Collectively the VA Vista, OpenVista(r), WorldVistA, TC2 and CHCS, and the approximately 150 VA and 100 MHS variants of these systems, arguably constitute the most advanced and comprehensive EMR functionality in the world from a medical and clinical perspective. In addition there are numerous hospitals that have developed proprietary variants of the VistA system.

    However, the theory of programming languages and application design and architecture has advanced significantly since the MUMPS language was invented. Few programmers schooled in modern programming languages would want to program in the language in the insert above, and none would understand it without specialized training. Workforce demographics and the availability of a trained workforce matter to large organizations because they affect ownership costs. Code comprehensibility matters because it affects programmer productivity.

    Our research is oriented towards determining if an optimal migration pathway exists that can cost-effectively preserve the functionality of VistA and other EMR systems that were originally expressed in MUMPS, while permitting those systems to be reexpressed in languages that can be more easily and cost-effectively maintained by programmers who do not possess MUMPS programming expertise. Our goals include introducing, during the conversion process, modern language, design and architectural features that are either not available, do not appear to be present, or are not currently manifested in these systems.

    In addition we are studying the feasibility of consolidating the various silo’d variants of the VistA system that are extant into a composite product line architecture, by means of a conversion process that can automatically derive common components and distinguish them from variant-unique components.

    Towards this end our conversion and code consolidation process has been applied to the entirety of WorldVista, OpenVista and TC2, with preliminary findings indicating that very high levels of product line consolidation may be possible for WorldVista and OpenVista (close to 58%), due to the very high levels of code consolidation observed. However, when code consolidation was applied to WorldVista and TC2 during the conversion process, relatively little (less than 20%) code consolidation was observed, a finding that suggests these two EMR systems are more different than similar. It should be noted that the meta-models of TC2 and WorldVista are also much more distinct than the meta-models of WorldVistA and OpenVista, which are very similar.

    Our goal is very simply to provide a means by which these EMR systems, that are wonderful from a clinical and medical perspective, can be fast-forwarded without loss of functionality to take advantage of the many advances in language theory and computing architectures that have been made in the many decades since MUMPS was invented. If this can be accomplished, their operational and maintenance costs will be reduced and more funding will be available to continue their functional enhancement.

    Philip Newcomb
    Chief Executive Officer, Chairman, Founder
    The Software Revolution, Inc.
    11410 NE 122nd Way, Suite 105 Kirkland, WA 98034
    office: (425) 284 2790
    ——————————————————————-
    TSRI Website: http://www.tsri.com/
    Stevens Awardee: http://www.reengineer.org/stevens/
    Recent Articles:
    • Architecture Driven Modernization – Government CIO http://www.governmentciomagazine.com/2013/01/architecture-driven-modernization
    • The Implications of Lehman’s Laws on IT Priorities in the Wake of the Sequester – Government CIO http://www.governmentciomagazine.com/2013/04/implications-lehman%E2%80%99s-laws-it-priorities-wake-sequester

    Reply
  6. David A. Wheeler

    David Whitten said:

    I am a co-founder of a volunteer group (WorldVistA) that has spent the last ten years making sure that the VistA system is available to people outside the VA. The code itself, as a Work of the Government is public domain, and has been available to the general public since the early 1980s.

    And for that I tip my hat. Thanks for your work!!

    “… Technical debt (also known as design debt or code debt) is a neologistic metaphor referring to the eventual consequences of poor or evolving software architecture and software development …

    Right. It’s important to note that in Ward Cunningham’s original formulation of technical debt, not all technical debt is bad. After all, few people can buy a house outright; you often have to accept some debt in order to get code out in time. But just like any debt, if you don’t work to pay the bills, accumulated debt over time can become overwhelming and you end up facing the creditors.

    The VistA System… is written in a language which took full advantage of minimal resources by having an extremely terse syntax, developed in an environment that was closed off from mainstream computer science, and created by people who were focused on medical state-of-the-art more than computer science state of the art. VistA also has been slowly starved of funds and expertise for almost 20 years yet still has a justified reputation for being the best Hospital Information System for patient care available.
    It seems strange that the very virtues that supported VistA are the ones that many use against it today. VistA and MUMPS use fewer computer resources, CPU and memory and hard disk space than any system I know of.”

    Here we get to the nub. Using “fewer computer resources, CPU and memory and hard disk space” is interesting, but in a world where $100 can buy a 2TB drive or a multi-GHz CPU, those are simply not the key measures. You can hire out incredibly fast computers as a service. Clearly adequate response time is needed for users, but any record-keeping software that uses modern technology and runs on modern equipment should work well. I think nowadays a more important metric is the “number of minutes before a developer new to the project and its language can make real improvements to the code”, and it is by that measure that most M-based systems struggle.

    It’s not clear that the system needs to switch from M/MUMPS. I’ve seen plenty of Perl code that was nearly as hideous as the sample above. But it’ll be hard to attract many new developers, and those developers will be less productive, if the code is that hard to read.

    Code readability is key, regardless of the language. I believe that if you can make code readable to humans, you will significantly improve development productivity. You’ll also improve quality as seen by the customer, since developers will be less likely to make mistakes.

    Reply
    1. Dan Post author

      Speaking of unreadable Perl code, I wrote this a few years ago:

      #!/usr/bin/perl
      @MIT=($*=1);@q=map{-42+ord}split m,,,qq #Just Another Perl Hacker...
      ;/CAJTSX``4*/*0*44/4 ) 4 44)4444/0444E\t1mnEmmE  magnus@MIT.edu  ;;;
      $**=0.1;for$U (0..8){@u=map{$q[$U+$_*9];}(0..4);$u[4]-=67;U(@u);};;;
      grep{print((map$_?q:*::q: :,(@$_)),$/);}@U;sub U{($h,$a,$c,$k,$e,$r)
      = @_;for ($e..$k/$*){$u=$**$_;$U[$a-$**$**$c*$u*$u][$h+$u]=q}q};};};

      (Sadly, this throws a deprecation warning in newer versions of Perl, ironically about a feature that I don’t use in this code.)

      I was a Perl guy for many years, and I defended Perl by pointing out that while it was common to write ugly code in Perl, it was eminently possible to write beautiful code in Perl too. (A lesson taught to me by Wilfredo Sánchez Vega.)

      Eventually I decided that, despite the 20+ years I had invested in Perl, and despite all the things I loved about it, it’s not a good idea to use Perl anymore. And so, I rewrote most of my personal applications in Node.JS.

      I was a PostgreSQL guy too, but I gave it up. I run WordPress, Mediatomb, and Gallery3 and they all use MySQL. For lotsa other stuff, SQLite3 is adequate. Running PostgreSQL became an extra hassle that I just didn’t need, so I refactored it out of all my apps.

      Technical Debt; trying to pay it down. It doesn’t matter that PostgreSQL or Perl were the right answer *then*. They’re the wrong answer *now*, and more wronger tomorrow. Who will make these calls for VistA?

      Reply
  7. Martin Mendelson

    Can we please get one thing straight before any more distortions are entered: the little paragraph from a VistA routine that is cited above as an example of MUMPS code is NOT – I repeat NOT – MUMPS code. It is only version information about the routine and consists solely of comments as denoted by the semicolon at the start of each line. If you want to demonstrate how awful MUMPS code can be, then at least display actual working code.

    Reply
    1. Dan Post author

      Wow… Now that you extensively quoted me poking MUMPS during the WorldVistA Community Meeting, should I take extra precautions for my safety? Change my route to work, etc? How many people now think I’m the devil?

      Reply
  8. Philip Newcomb

    Dan,

    In reference to my claim that VistA could be converted for $10M, there have been some interesting new developments that further substantiate this claim. In our recent work we have demonstrated 100% automated conversion, with no human intervention required whatsoever, from MUMPS to JSE/JEE for 47,000 lines of MUMPS. The resultant code has been independently validated and proven to be error free. This represents a 6-sigma level of fidelity in the accuracy of the code conversion technology: 0 errors in 47,000 lines of translated code. We have also succeeded in converting 1.3 million lines of MUMPS to a cleanly compiling and linking state, also with 0 human intervention (0 errors in 1.3 million lines of compiling and linking code).

    The JSE/JEE code that is produced by the automated converter is fully compliant with VA/DOD Websphere-based Enterprise Service Bus (ESB) architectural requirements. That is, it is a multi-tier JSE/JEE application, with business logic running in a webserver and a web browser providing support for mobile as well as datacenter applications, that provides, from an operator’s perspective, perfect functional equivalence for all existing user interaction (0 operator retraining) and database operations. It seamlessly interoperates with the MUMPS code of existing applications, and it employs the existing open source version of Intersystems Cache database, thus achieving DB performance equivalent to, and interoperability with, the original MUMPS EMR and EHR systems. This translator behaves like a cross compiler, except that instead of producing assembler for different HW platforms it produces Java that runs in a JVM and targets JEE data center and cloud-based SAAS/PAAS/IAAS (i.e. open architectures).

    What this means for MUMPS programmers is that they can continue to program in MUMPS, because the converter can be used as a cross-compiler to transition their code to x86 data centers and cloud-based architectures, such as the VHA/DOD ESB, without any need for them to learn Java or JEE. What it means for the VHA and DoD is that it provides an automated incremental transition pathway for migrating all of their MUMPS code to the ESB, and their MUMPS programming staff can happily continue programming in MUMPS while the organizations move the IT infrastructure and transition their healthcare applications’ HW/SW architecture and data centers into the future. What this means for the leading COTS providers of healthcare systems, such as EPIC, McKesson and Cerner, all of whom have systems whose core code is still written in MUMPS and who are vying with each other to sell new EHR and EMR components to the VHA and DoD, is that they can rapidly achieve full architectural compatibility with the DOD/VA ESB, which is the CONOPS they must run in, simply by having TSRI cross compile their systems to achieve compliance with the JSE/JEE architecture of the ESB.

    This conversion process unifies code as it converts it, so MUMPS coming from disparate sources is sorted into common libraries wherever duplication is found, and the future architecture remains disciplined, controlled and manageable. The generated Java/JEE code all operates within a common environment, the composition of existing multidimensional data is preserved, and all the systems in the composite product line architecture (PLA) can seamlessly interoperate with each other.

    The solution provides several additional benefits. The cross-compiler generates full UML models of the resultant code. The multidimensional meta-models of the systems are analyzed, extracted and expressed as UML class diagrams. The application code is redefined as EGL pseudo code, and the resultant application logic is documented with dataflow diagrams, structure charts, cause-effect graphs, state transition tables and state machine graphs.

    Several additional features round out the solution architecture. An external meta-model import mechanism allows mappings to be defined between existing multidimensional data sources, so that new data flows can be defined between VHA and DoD or 3rd party multidimensional databases in the translated code. In the initial solution they cohabitate, all share the same house, and lie in the same bed, but don’t exchange data (pardon me for not taking the analogy any further). All code in the PLA can share a common operating environment. All generated Java code units exhibit a common two-tiered API. Any Java method derived from MUMPS procedures can be provided with a SOA API. Every MUMPS GLOBAL has a dual representation as a Java Data Access Object (DAO) for which a SOA API can be defined. Thus, web service business process execution languages that use discovery services of SOA APIs can be easily added to create new business processes employing finely grained exposed capabilities.

    The conversion process has been incrementally applied to WorldVista, OpenVista, VHA Vista, TC2 and CHCS, as well as to several commercial and proprietary hospital MUMPS systems, in a series of scaling demonstrations. Altogether, in excess of 30,000,000 statements of MUMPS code have been transformed, consolidated and refactored with the converter, and 10s of millions of lines of meta-data associated with these systems, which define the multi-dimensional databases for their respective medical records systems, have been documented as UML diagrams and converted into EGL. Approximately 200,000 lines of MUMPS code in modules from both the VHA VistA and MHS CHCS systems have been converted, along with Fileman and Taskman support code, in a series of pilot activities.

    Taking the VHA and DoD’s combined IT budget of $5 billion in 2013 as a baseline, the two agencies must have invested at least $100 billion in their MUMPS applications since they began operating MUMPS in their data centers in the late 1970s. This is certainly not an investment anyone who is sane would lightly throw away. But, like most large and complex legacy systems, the VA and MHS systems have reached a point in their lifetimes where accumulated technical debt and other vestiges of their antiquity make it desirable to refresh their technical IT architecture without discarding its business functionality, to enable them to continue to evolve. Automated transition to a composite PLA will enable the best functionality from both the VHA and MHS to be preserved, so they continue to perform their respective missions without skipping a heartbeat, while positioning them for the next phase of their evolution.
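
    The idea of giving a MUMPS global a second face as a Data Access Object can be illustrated with a small sketch. This is Python rather than Java, it is not TSRI's generated code, and every name in it (`GlobalStore`, `PatientDao`, `Patient`, the `NAME`/`DOB` subscripts) is a hypothetical chosen for illustration; real FileMan files use a different, piece-delimited node layout. The sketch assumes only the general shape of a global: a sparse tree of values addressed by ordered subscripts, walked with something like `$ORDER`.

```python
from dataclasses import dataclass


def _collate(sub):
    """Sort key: numeric subscripts first, then strings (a simplification of M collation)."""
    if isinstance(sub, (int, float)):
        return (0, sub, "")
    return (1, 0, str(sub))


class GlobalStore:
    """Toy stand-in for a single M global: a sparse tree addressed by subscript tuples."""

    def __init__(self):
        self._nodes = {}  # subscript tuple -> value

    def set(self, subscripts, value):
        self._nodes[tuple(subscripts)] = value

    def get(self, subscripts, default=""):
        return self._nodes.get(tuple(subscripts), default)

    def order(self, prefix, last=None):
        """Next subscript after `last` directly below `prefix`, or None (loosely like $ORDER)."""
        prefix = tuple(prefix)
        depth = len(prefix)
        children = {k[depth] for k in self._nodes
                    if len(k) > depth and k[:depth] == prefix}
        for child in sorted(children, key=_collate):
            if last is None or _collate(child) > _collate(last):
                return child
        return None


@dataclass
class Patient:
    ien: int   # internal entry number, used as the first subscript
    name: str
    dob: str


class PatientDao:
    """Object-style view over the same nodes a routine would reach by subscript."""

    def __init__(self, store):
        self._store = store

    def save(self, patient):
        self._store.set((patient.ien, "NAME"), patient.name)
        self._store.set((patient.ien, "DOB"), patient.dob)

    def find(self, ien):
        if self._store.order((ien,)) is None:  # no nodes under this entry
            return None
        return Patient(ien, self._store.get((ien, "NAME")), self._store.get((ien, "DOB")))

    def all_iens(self):
        ien = self._store.order(())
        while ien is not None:
            yield ien
            ien = self._store.order((), ien)


if __name__ == "__main__":
    dao = PatientDao(GlobalStore())
    dao.save(Patient(10, "DOE,JOHN", "1950-01-01"))
    dao.save(Patient(2, "ROE,JANE", "1960-02-02"))
    for ien in dao.all_iens():
        print(dao.find(ien))
```

    The point of the "dual representation" is that both views address the same data: subscript-level access stays available for translated routine logic, while the DAO gives service-layer code typed objects it can expose through an API.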

    All this is, in our view, simply promising. We are not by any means finished with our research into the ingredients that will make up the ultimate solution architecture. Considering the high-assurance nature of the code to which this solution will be applied, our work needs additional scaling studies and far more testing and evaluation; however, we are clearly seeing very promising results from the R&D effort we have been quietly conducting for the MHS since 2011 and the VHA since 2005, and I believe more firmly than ever that the original claim can be substantiated. We are seeking to achieve a 100% push-button conversion that can first translate, unify and document both the VHA and DoD’s core VistA, CHCS and TC2 systems in a matter of hours into fully functional code, with no requirement whatsoever for human intervention, and then be incrementally applied to all the field variations of MUMPS in the 207 VHA and MUMPS hospitals as a seamless procedure.

    “Interestingly, Philip Newcomb, CEO of The Software Revolution Inc., has asserted that his company’s technology could convert VistA from MUMPS to J2EE in about a year for $10m. If true, this would be a bargain, IMHO”

    Reply
  9. Srinidhi Boray

    Regarding iEHR – VistA and ALHTA interoperability

    Conquering Uncertainty Create Infinite Possibilities.

    A few points observed from discussions with several folks in the industry, including a few participants on this blog.

    1. MUMPS is not really the problem. As noted by Rama, it stands the test of all software analysis, especially static analysis and other architecture assessment. No doubt it is archaic.
    2. Transforming architecture from one syntax or semantics into another modern programming language or modern architecture is also not the formidable challenge. Transforming legacy into modern architecture is not the central unsolvable concern.
    3. As Tom Munnecke asserts, medical semantics inherently follow a structure that is not amenable to standard relational structure, so we enter the realm of NoSQL, where unstructured data also begins to be managed alongside data gleaned from the EHR. Furthermore, Tom suggests employing algebra-driven algorithms, as these are better suited than arithmetic for creating architecture, and also most suited for creating iEHR: interoperability among a system of systems, solving massive uncertainty and complex systemic challenges, and overcoming the dilemma of best-of-breed solutions in the market. This can be achieved at significantly lower cost than the billions of dollars being touted, up from an initial estimate of $4 billion ($8 billion in 2013), for ALHTA and VistA and the rest of the world.
    4. Gartner, in its assessment, said that VistA is not Gen 3 ready, which basically says VistA is not mature enough to develop into iEHR and to deliver Evidence Based Medicine (EBM). This was hotly contested by OSEHRA.
    5. Importantly, VistA and ALHTA have captured millions of records over 30 years. When these records begin to merge with genomics (MVP, the Million Vet Program) they will create massive insights for the entire healthcare industry the world over, besides providing a path into EBM and pharmacogenomics (which delivers personalized medicine driven by gene therapy instead of pathology).
    6. US and EU healthcare interoperability offers the opportunity to mine massive data: 700 million plus records. This can become the cradle for all future healthcare discoveries, when biologics gets integrated into genomics and EHR.

    With the above points in mind, an approach was developed (the QEXL Approach, described at the link below), driven by algebra, that creates a declarative multivariate cognitive architecture developing into a Probabilistic Ontology suited to deliver EBM, pharmacogenomics, etc.

    http://ingine.wordpress.com/2013/08/16/the-qexl-approach-universal-healthcare-interoperability-based-on-probabilistic-ontology/

    Thanks
    Srinidhi Boray

    Side notes:-

    For several years I have been probing Implicate Order as a way of modeling complex systems, owing to the fact that all systems of systems are probabilistically deterministic. Also, the methods and techniques chosen to design and develop a complex system exhibit a Cartesian dilemma: the methods used to study macro behavior and micro behavior follow different mathematical schemes. Almost all micro behaviors are characterized by Cartesian methods, and these break down when they are scaled up to study macro behavior. This particular vexation is the reason why a unified equation in physics that reconciles macro physics with quantum behavior remains elusive. In probing for a way to describe a system with such a dilemma, Implicate Order as a metaphor seems to fit the probabilistic paradigm.

    Every system strategically has a challenge to resolve at the macro level pertinent to a “context” (for instance, healthcare management efficiency) and a challenge to be met at the micro, “functional” level (for example, personalized healthcare delivery, which is efficacy). These two combined are at this time among the most compelling problems facing the entire world. The Obama healthcare reform is one such initiative to contain the complexity in a system that has gone bizarre and unmanageable. Unfortunately this systemic problem is mostly man-made.

    Over the last few months I have been interacting with industry experts on bioinformatics and linguistic semantics; both apply computational mathematics to create probabilistic inference to reconcile the semantic differences in the vast uncertain information that exists in the system. Also, a highly complex system is characterized by a high degree of randomness (stochasticity). These approaches help in creating a probabilistic ontology as envisioned in the implicate order. Furthermore, because probabilistic inferencing keeps growing as the system progresses in the time continuum, these approaches provide a means to study Complex Adaptive Systems and also trigger Generative development.

    The most common method for creating probabilistic inference has been the Bayesian network. But these are being contested as suitable only for a static model and not inherently dynamic, so other methods are being proposed with far-reaching results. All of these work on inductive logic.

    There are several developments going on in the area of probabilistic modeling. Stanford has begun to offer a free course on probabilistic graphical modeling based on Bayesian methods. At the link below, go to the preview link to access the entire video coursework. Awesome coursework for beginners in system modeling.

    https://www.coursera.org/course/pgm?from_restricted_preview=1&course_id=22&r=https%3A%2F%2Fclass.coursera.org%2Fpgm%2Fauth%2Fauth_redirector%3Ftype%3Dlogin%26subtype%3Dnormal%26visiting%3Dhttps%253A%252F%252Fclass.coursera.org%252Fpgm%252Fclass%252Findex

    Reply
      1. Srinidhi Boray

        Oops!! My bad… I meant to write AHLTA instead of ALHTA :)

        Reply
  10. Valerie J H Powell

    David (Whitten), Thanks for your contributions on this topic. Many of you who wrote on this point may not be aware that one of the original purposes for MUMPS was to make it possible for medical professionals and programmers to work closely and iteratively together to help assure that EHR software would meet clinical needs as perceived by the practicing clinician. This is one reason, I suspect, for the terse code. The best treatment of medical/programmer collaboration using M[UMPS] to develop and refine VistA is Rick Marshall’s book on VistA development (VistA Mastery, available at: http://vistaexpertise.net/mumpsbooks.html). I have been teaching M[UMPS] at the university level since 1981 (when I used a PDP-11 donated by a medical lab company from the Dallas area). When I was completing advanced studies in computer science I was appalled at the lack of scientific frame-of-mind in assessing M[UMPS]. There is an arrogance in computer science revealed in such behavior, for while CS has seen many advances, there are radically advanced ideas in M (from its MIT-graduate principal designer) that I have found from experience most CS people have difficulty grasping. If all these new languages are so great, why do they experience so many information security challenges? (I teach information security using virtual machine arrays). When I get time to program in M[UMPS], it is like using a Porsche instead of an old Chevrolet. Having done quite a bit of teaching with Pascal earlier, I am strict about what in M[UMPS] I let my students use, and strict about them enforcing discipline and design.

    When I teach SQL, I insist my students be capable of analyzing relational design on the back of a napkin if necessary and that they know the strict (irreflexive) partial order of foreign key references in relational database design. I have found that students make predictable errors in SQL query design (which I reported to AMIA in 2004). Even experienced SQL programmers made such errors. Only a few SQL professionals, proportionately, know systematically how to avoid such errors (exclusionary queries). I had to turn to the most highly experienced psycholinguists (I have a PhD minor in linguistics from UT-Austin) to help analyze this (mis-)handling of SQL syntax. One of my strange impressions in teaching is finding that, in spite of the syntactic differences, students who learn the Prolog logic programming language (with its version of relations) subsequently do better with SQL, but I have not taken the time to explore that systematically.
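
    For readers unfamiliar with the term, here is one common form an exclusionary-query mistake can take, sketched with Python's built-in `sqlite3` module. Whether this is the exact error pattern reported to AMIA is an assumption; the `patient` and `prescription` tables and their rows are made up for illustration. The question is "which patients have never been prescribed aspirin?" The tempting version filters rows with `<>`, which answers a different question ("who has at least one non-aspirin prescription?"); the exclusionary version uses `NOT EXISTS`.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE prescription (
            patient_id INTEGER NOT NULL REFERENCES patient(id),
            drug TEXT NOT NULL
        );
        INSERT INTO patient VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO prescription VALUES (1, 'aspirin'), (1, 'statin'), (2, 'statin');
    """)
    return conn


# Tempting but wrong: keeps Ann (her statin row passes the filter)
# and silently drops Cy (no prescription rows, so nothing to join).
WRONG = """
    SELECT DISTINCT p.name
    FROM patient p
    JOIN prescription r ON r.patient_id = p.id
    WHERE r.drug <> 'aspirin'
    ORDER BY p.name
"""

# Exclusionary form: a patient qualifies only if no aspirin row exists at all.
RIGHT = """
    SELECT p.name
    FROM patient p
    WHERE NOT EXISTS (
        SELECT 1 FROM prescription r
        WHERE r.patient_id = p.id AND r.drug = 'aspirin'
    )
    ORDER BY p.name
"""


def names(conn, sql):
    """Run a query and return the first column of each row as a list."""
    return [row[0] for row in conn.execute(sql)]


if __name__ == "__main__":
    conn = build_demo_db()
    print("row filter (wrong):", names(conn, WRONG))
    print("NOT EXISTS (right):", names(conn, RIGHT))
```

    The two queries read almost the same in English, which may be why the mistake is so predictable: the row-level filter tests each prescription, while the question is about the whole set of a patient's prescriptions.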

    Best wishes,

    Valerie

    Valerie Powell

    Reply
