Tuesday, February 28, 2012

JAVA: How can I download an HTML file from a site that requires cookies enabled?


I'm trying to download an HTML file from a site. I'm using the following simple method:




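URL url = new URL("here goes the link to the html file");
StringBuilder html = new StringBuilder();
// try-with-resources closes the reader (and the underlying stream) when done
try (BufferedReader br = new BufferedReader(new InputStreamReader(url.openStream()))) {
    String temp;
    while ((temp = br.readLine()) != null) {
        html.append(temp).append('\n'); // readLine() strips line breaks, so restore them
    }
}
String htmlfile = html.toString();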



The problem is that I get the following String in the htmlfile variable:




The installation of ... requires the acceptance of a cookie by your browser
software. The cookie is used to ensure that you and only you are
able to access information ....



In other words, I need to somehow enable cookies when opening a stream from the URL. Is it possible to achieve this with the URL class, or do I need a different approach? Thanks in advance.

2 comments:

  1. If you use a good library like Apache HttpComponents

    http://hc.apache.org/index.html

    it takes care of cookie-management for you.

  2. You can use addRequestProperty() to set a cookie on a URLConnection object, e.g.

    URL url = new URL("here goes the link to the html file");
    URLConnection connection = url.openConnection();
    connection.addRequestProperty("Cookie", "here goes the cookie");
    BufferedReader br = new BufferedReader(new InputStreamReader(connection.getInputStream()));

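If the cookie value is not known in advance, one option that stays within the JDK is to install a java.net.CookieManager as the default CookieHandler. Connections opened through URL should then store cookies from responses and send them back on later requests. The following is only a minimal sketch of that idea: the class name CookieDownload, the helper names enableCookies and fetch, the UTF-8 charset, and the ACCEPT_ALL policy are assumptions for illustration, not something the site in the question is known to need.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, named here only for illustration.
class CookieDownload {

    // Install a JVM-wide, in-memory cookie store. After this call, HTTP(S)
    // connections opened through java.net.URL keep cookies from responses
    // and send them back on later requests to the same site.
    static void enableCookies() {
        // ACCEPT_ALL is an assumption; a stricter policy may be preferable.
        CookieHandler.setDefault(new CookieManager(null, CookiePolicy.ACCEPT_ALL));
    }

    // Read the whole response body as text, keeping line breaks.
    // UTF-8 is an assumption; the real page may use another charset.
    static String fetch(String address) throws IOException {
        URL url = new URL(address);
        StringBuilder html = new StringBuilder();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }
}
```

A possible usage is to call CookieDownload.enableCookies() once at startup and then CookieDownload.fetch(...) with the page address. If the site only hands out its cookie on the first response, a second fetch of the same address may be needed before the real page comes back.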