Decoding structured JSON arrays with circe in Scala


Suppose I need to decode JSON arrays that look like the following, where there are a couple of fields at the beginning, some arbitrary number of homogeneous elements, and then some other field:

    [ "Foo", "McBar", true, false, false, false, true, 137 ]

I don't know why anyone would choose to encode their data like this, but people do weird things, and suppose in this case I just have to deal with it.

I want to decode this JSON into a case class like this:

    case class Foo(firstName: String, lastName: String, age: Int, stuff: List[Boolean])

We can write something like this:

    import cats.syntax.either._
    import io.circe.{ Decoder, DecodingFailure, Json }

    implicit val fooDecoder: Decoder[Foo] = Decoder.instance { c =>
      // Leave the cursor and work on the raw JSON values instead.
      c.focus.flatMap(_.asArray) match {
        case Some(fnJ +: lnJ +: rest) =>
          // Reverse the remainder so the last element (the age) can be matched first.
          rest.reverse match {
            case ageJ +: stuffJ =>
              for {
                fn    <- fnJ.as[String]
                ln    <- lnJ.as[String]
                age   <- ageJ.as[Int]
                stuff <- Json.fromValues(stuffJ.reverse).as[List[Boolean]]
              } yield Foo(fn, ln, age, stuff)
            case _ => Left(DecodingFailure("Foo", c.history))
          }
        // Not an array, or an array with fewer than two elements.
        case _ => Left(DecodingFailure("Foo", c.history))
      }
    }

…which works:

    scala> fooDecoder.decodeJson(json"""[ "Foo", "McBar", true, false, 137 ]""")
    res3: io.circe.Decoder.Result[Foo] = Right(Foo(Foo,McBar,137,List(true, false)))

But ugh, that's horrible. Also, the error messages are completely useless:

    scala> fooDecoder.decodeJson(json"""[ "Foo", "McBar", true, false ]""")
    res4: io.circe.Decoder.Result[Foo] = Left(DecodingFailure(Int, List()))

Surely there's a way to do this that doesn't involve switching back and forth between cursors and Json values, throwing away history in our error messages, and just generally being an eyesore?


Some context: questions about writing custom JSON array decoders like this in circe come up fairly often (e.g. this morning). The specific details of how to do this are likely to change in an upcoming version of circe (although the API will be similar; see this experimental project for some details), so I don't really want to spend a lot of time adding an example like this to the documentation, but it comes up enough that I think it does deserve a Stack Overflow Q&A.

Working with cursors

There is a better way! You can write this much more concisely while also maintaining useful error messages by working directly with cursors all the way through:

    case class Foo(firstName: String, lastName: String, age: Int, stuff: List[Boolean])

    import cats.syntax.either._
    import io.circe.Decoder

    implicit val fooDecoder: Decoder[Foo] = Decoder.instance { c =>
      // Step into the array; the cursor now points at the first element.
      val fnC = c.downArray

      for {
        fn     <- fnC.as[String]
        lnC     = fnC.deleteGoRight
        ln     <- lnC.as[String]
        ageC    = lnC.deleteGoLast
        age    <- ageC.as[Int]
        stuffC  = ageC.delete
        stuff  <- stuffC.as[List[Boolean]]
      } yield Foo(fn, ln, age, stuff)
    }

This also works:

    scala> fooDecoder.decodeJson(json"""[ "Foo", "McBar", true, false, 137 ]""")
    res0: io.circe.Decoder.Result[Foo] = Right(Foo(Foo,McBar,137,List(true, false)))

But it also gives us an indication of where errors happened:

    scala> fooDecoder.decodeJson(json"""[ "Foo", "McBar", true, false ]""")
    res1: io.circe.Decoder.Result[Foo] = Left(DecodingFailure(Int, List(DeleteGoLast, DeleteGoRight, DownArray)))

Also, it's shorter, more declarative, and doesn't require that unreadable nesting.

How it works

The key idea is that we interleave "reading" operations (the .as[X] calls on the cursor) with navigation / modification operations (downArray and the three delete method calls).

When we start, c is an HCursor that we hope points at an array. c.downArray moves the cursor to the first element in the array. If the input isn't an array at all, or is an empty array, this operation will fail, and we'll get a useful error message. If it succeeds, the first line of the for-comprehension will try to decode that first element as a string, leaving our cursor pointing at that first element.

The second line in the for-comprehension says "okay, we're done with the first element, so let's forget about it and move to the second". The delete part of the method name doesn't mean it's actually mutating anything (nothing in circe ever mutates anything in any way that users can observe); it just means that the element won't be available to any future operations on the resulting cursor.

The third line tries to decode the second element in the original JSON array (now the first element in our new cursor) as a string. When that's done, the fourth line "deletes" that element and moves to the end of the array, and then the fifth line tries to decode that final element as an Int.

The next line is probably the most interesting:

    stuffC  = ageC.delete

This says: okay, we're at the last element in our modified view of the JSON array (where earlier we deleted the first two elements). Now we delete the last element and move the cursor up so that it points at the entire (modified) array, which we can then decode as a list of booleans, and we're done.
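To make the navigation steps above easier to follow, here is a minimal sketch that models them on a plain Vector using only the standard library. It is a toy model, not circe's cursor implementation: the names CursorSketch and Focus are made up for this illustration, and the methods only borrow the names of the cursor operations discussed above.

```scala
// Toy model only: this is NOT how circe implements its cursors.
// CursorSketch and Focus are hypothetical names used for illustration.
object CursorSketch {
  // An immutable view of an array plus the index currently in focus.
  final case class Focus[A](items: Vector[A], index: Int) {
    def value: A = items(index)

    // The view without the focused element; the original Vector is untouched.
    private def remaining: Vector[A] = items.patch(index, Nil, 1)

    // Drop the focused element and focus what used to be its right neighbour.
    def deleteGoRight: Option[Focus[A]] =
      if (index < remaining.length) Some(Focus(remaining, index)) else None

    // Drop the focused element and focus the last remaining element.
    def deleteGoLast: Option[Focus[A]] =
      if (remaining.nonEmpty) Some(Focus(remaining, remaining.length - 1)) else None

    // Drop the focused element and go back up to the whole (modified) array.
    def delete: Vector[A] = remaining
  }

  // Counterpart of downArray: focus the first element, if there is one.
  def downArray[A](items: Vector[A]): Option[Focus[A]] =
    if (items.nonEmpty) Some(Focus(items, 0)) else None

  val input: Vector[Any] = Vector("Foo", "McBar", true, false, 137)

  val steps = for {
    fnC  <- downArray(input)   // focus "Foo"
    lnC  <- fnC.deleteGoRight  // view: "McBar", true, false, 137; focus "McBar"
    ageC <- lnC.deleteGoLast   // view: true, false, 137; focus 137
  } yield (fnC.value, lnC.value, ageC.value, ageC.delete)
}
```

Each step returns a new view rather than changing the old one, so input still holds all five elements after steps has been computed. That is the sense in which "delete" only affects what later operations on the resulting cursor can see.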

More error accumulation

There's actually an even more concise way you can write this:

    import cats.syntax.all._
    import io.circe.Decoder

    implicit val fooDecoder: Decoder[Foo] = (
      Decoder[String].prepare(_.downArray),
      Decoder[String].prepare(_.downArray.deleteGoRight),
      Decoder[Int].prepare(_.downArray.deleteGoLast),
      Decoder[List[Boolean]].prepare(_.downArray.deleteGoRight.deleteGoLast.delete)
    ).map4(Foo)

This will also work, and it has the added benefit that if decoding would fail for more than one of the members, you can get error messages for all of the failures at the same time. For example, if we have something like this, we should expect three errors (for the non-string first name, the non-integral age, and the non-boolean stuff value):

    val bad = """[["Foo"], "McBar", true, "true", false, 13.7 ]"""

    val badResult = io.circe.jawn.decodeAccumulating[Foo](bad)

And that's what we see (together with the specific location information for each failure):

    scala> badResult.leftMap(_.map(println))
    DecodingFailure(String, List(DownArray))
    DecodingFailure(Int, List(DeleteGoLast, DownArray))
    DecodingFailure([A]List[A], List(MoveRight, DownArray, DeleteGoParent, DeleteGoLast, DeleteGoRight, DownArray))

Which of these two approaches you should prefer is a matter of taste and of whether or not you care about error accumulation; I personally find the first a little more readable.
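As a rough illustration of the difference between the two styles, the following sketch contrasts fail-fast and accumulating behaviour using only the standard library. Plain Either values stand in for decoding results here, and the object name, the Result alias and the error strings are all made up for the example; it does not use circe or cats.

```scala
// Sketch only: plain Either values stand in for decoding results,
// and the error strings are invented for illustration.
object AccumulationSketch {
  type Result[A] = Either[String, A]

  val firstName: Result[String] = Left("first name is not a string")
  val lastName: Result[String]  = Right("McBar")
  val age: Result[Int]          = Left("age is not an integer")

  // Fail-fast: the for-comprehension stops at the first Left it meets.
  val failFast: Result[(String, String, Int)] =
    for {
      fn <- firstName
      ln <- lastName
      a  <- age
    } yield (fn, ln, a)

  // Accumulating: look at every member independently and keep all the Lefts.
  val accumulated: Either[List[String], (String, String, Int)] =
    (firstName, lastName, age) match {
      case (Right(fn), Right(ln), Right(a)) => Right((fn, ln, a))
      case (fnR, lnR, ageR) =>
        Left(List(fnR, lnR, ageR).collect { case Left(error) => error })
    }
}
```

Here failFast only reports the first-name problem, while accumulated reports both the first-name and the age problems. The for-comprehension style corresponds to the first behaviour, and combining independent decoders corresponds to the second.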

