How to parse URL strings in Java

I tend to use Apache HttpClient as my preferred Java HTTP client. I hit an error with illegal characters, such as the space character, in this URL:

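import org.apache.http.client.methods.HttpGet
import org.apache.http.impl.client.DefaultHttpClient
import org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager

val urlString = "http://maps.google.com/maps?q=Merrick, NY"

val cm = new ThreadSafeClientConnManager()
val client = new DefaultHttpClient(cm)
val httpRequest = new HttpGet(urlString) // throws: the space is not a legal URI character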

java.lang.IllegalArgumentException: Illegal character in query at index 38: http://maps.google.com/maps?q=Merrick, NY
at java.net.URI.create(URI.java:859)
at org.apache.http.client.methods.HttpGet.<init>(HttpGet.java:69)
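
The failure can be reproduced without HttpClient at all, since the stack trace shows it comes from java.net.URI. Here is a minimal Java sketch; the class and helper names (IllegalCharDemo, firstIllegalIndex) are my own, made up for illustration:

```java
import java.net.URI;
import java.net.URISyntaxException;

class IllegalCharDemo {
    // Hypothetical helper: returns the index of the first character that
    // java.net.URI rejects, or -1 if the string parses as a URI.
    static int firstIllegalIndex(String s) {
        try {
            new URI(s);
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    public static void main(String[] args) {
        // The space after the comma sits at index 38.
        System.out.println(firstIllegalIndex("http://maps.google.com/maps?q=Merrick, NY"));
    }
}
```

The reported index is the same 38 as in the exception message above, because URI.create just wraps the URISyntaxException in an IllegalArgumentException.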

My first attempt was to use java.net.URLEncoder.encode:

java.net.URLEncoder.encode(urlString, "UTF-8")
res4: java.lang.String = http%3A%2F%2Fmaps.google.com%2Fmaps%3Fq%3DMerrick%2C+NY

But this doesn’t work. URLEncoder is meant for HTML form data, so it encodes the entire URL string, including the scheme and the path separators.
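
To see what URLEncoder is actually suited for, it helps to apply it to the parameter value alone. This is only a sketch in Java; the class and helper names (FormEncodingDemo, withEncodedValue) are hypothetical, and it assumes you already have the base URL and the parameter as separate strings, which is not the situation above:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class FormEncodingDemo {
    // Hypothetical helper: form-encodes only the parameter value and
    // leaves the base URL and the parameter name untouched.
    static String withEncodedValue(String base, String name, String value) {
        try {
            return base + "?" + name + "=" + URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://maps.google.com/maps?q=Merrick%2C+NY
        System.out.println(withEncodedValue("http://maps.google.com/maps", "q", "Merrick, NY"));
    }
}
```

Note the form-style output: the space becomes "+" rather than "%20". That does not help when all you have is one complete URL string.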

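Our goal is to escape only the illegal characters in the param part of the URL, converting “http://maps.google.com/maps?q=Merrick, NY” to “http://maps.google.com/maps?q=Merrick,%20NY”. The comma is a legal character in a query, so java.net.URI leaves it alone; only the space has to become %20.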

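The trick is to construct a URL object, which parses the string without rejecting the space, so we can get the separate components, and then create a URI object from these components. The multi-argument URI constructor quotes any illegal characters in each component: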

val urlString = "http://maps.google.com/maps?q=Merrick, NY"
val url = new java.net.URL(urlString)
val uri = new java.net.URI(url.getProtocol, url.getAuthority, url.getPath, url.getQuery, null)
val httpRequest = new HttpGet(uri)
httpRequest: org.apache.http.client.methods.HttpGet = org.apache.http.client.methods.HttpGet@15ec2337
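
For readers working in plain Java, the same idea can be sketched as follows. The class and method names (UrlToUriSketch, toUri) are my own, and wrapping the checked exceptions in IllegalArgumentException is a choice I made for brevity, not something the approach requires:

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class UrlToUriSketch {
    // Hypothetical helper: splits the string with java.net.URL, then lets the
    // multi-argument URI constructor quote illegal characters per component.
    static URI toUri(String urlString) {
        try {
            URL url = new URL(urlString);
            // Fragment is passed as null, as in the snippet above.
            return new URI(url.getProtocol(), url.getAuthority(),
                           url.getPath(), url.getQuery(), null);
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://maps.google.com/maps?q=Merrick,%20NY
        System.out.println(toUri("http://maps.google.com/maps?q=Merrick, NY"));
    }
}
```

One caveat: these URI constructors always quote the percent character, so a string that is already escaped gets escaped a second time (%20 turns into %2520). This sketch is only appropriate for raw, unescaped input.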
