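In this article, we will show you how to use Jsoup to check whether a URL is going to redirect.
1. URL Redirection
Normally, a redirecting URL returns an HTTP status code in the 3xx range, such as 301 or 307, and the target URL is given in the "Location" field of the response header.
Review a sample HTTP response header: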
HTTP code : 301 moved permanently
{
Location=http://mkyong.com
Server=GSE,
Cache-Control=no-cache,
no-store,
max-age=0,
must-revalidate
}
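The check described above can be sketched without any HTTP library. This is a minimal illustration, not part of the original article: the class name RedirectCheck and both helper methods are hypothetical, and the list of status codes adds 302, 303, and 308 to the 301 and 307 mentioned in the text. Header names are matched case-insensitively, since HTTP field names are not case-sensitive.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

public class RedirectCheck {

    // Status codes that normally signal a redirect with a Location header.
    // (301 and 307 come from the article; 302, 303 and 308 are added here.)
    public static boolean isRedirectStatus(int statusCode) {
        return statusCode == 301 || statusCode == 302 || statusCode == 303
                || statusCode == 307 || statusCode == 308;
    }

    // HTTP header names are case-insensitive, so look the field up that way.
    public static Optional<String> locationOf(Map<String, String> headers) {
        Map<String, String> folded = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        folded.putAll(headers);
        return Optional.ofNullable(folded.get("location"));
    }

    public static void main(String[] args) {
        // Sample headers modelled on the response shown above.
        Map<String, String> headers = Map.of(
                "Location", "http://mkyong.com",
                "Server", "GSE");
        System.out.println(isRedirectStatus(301));
        System.out.println(locationOf(headers).orElse("(none)"));
    }
}
```

Run as is, this prints true followed by http://mkyong.com.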
2. Jsoup Example
2.1 By default, Jsoup follows redirects recursively and returns the final URL.
RedirectExample.java
package com.mkyong.crawler;

import java.io.IOException;

import org.jsoup.Connection.Response;
import org.jsoup.Jsoup;

public class RedirectExample {

    public static void main(String[] args) throws IOException {

        String url = "http://goo.gl/fb/gyBkwR";

        // Redirects are followed by default, so this is the final response
        Response response = Jsoup.connect(url).execute();
        System.out.println(response.statusCode() + " : " + response.url());

    }

}
Output
200 : http://www.mkyong.com/mongodb/mongodb-remove-a-field-from-array-documents/
2.2 To test whether a URL redirects, set followRedirects to false.
Response response = Jsoup.connect(url).followRedirects(false).execute();
System.out.println(response.statusCode() + " : " + response.url());
// check whether the URL is going to redirect
System.out.println("Is URL going to redirect : " + response.hasHeader("location"));
System.out.println("Target : " + response.header("location"));
Output
301 : http://goo.gl/fb/gyBkwR
Is URL going to redirect : true
Target : http://feeds.feedburner.com/~r/FeedForMkyong/~3/D_6Jqi4trqo/...
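One caveat worth illustrating: a Location header may hold a relative reference rather than a full URL, in which case the value has to be resolved against the URL that was requested before it can be fetched. The sketch below uses only java.net.URI from the standard library. The class name LocationResolver and the example.com URLs are hypothetical and do not come from the original article.

```java
import java.net.URI;

public class LocationResolver {

    // Resolves a Location header value against the URL that was requested.
    // Absolute values are returned unchanged; relative ones are made absolute.
    // URI.create throws IllegalArgumentException if either string is not a valid URI.
    public static String resolve(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }

    public static void main(String[] args) {
        // Hypothetical URLs, for illustration only.
        System.out.println(resolve("http://example.com/a/b", "/c"));
        System.out.println(resolve("http://example.com/a/b", "http://example.org/x"));
    }
}
```

With Jsoup, the two arguments would typically be the requested URL and the value of response.header("location").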
3. Jsoup Example, Again
3.1 This example prints each redirect URL recursively.
RedirectExample.java
package com.mkyong.crawler;

import java.io.IOException;

import org.jsoup.Connection.Response;
import org.jsoup.Jsoup;

public class RedirectExample {

    public static void main(String[] args) throws IOException {

        String url = "http://goo.gl/fb/gyBkwR";

        RedirectExample obj = new RedirectExample();
        obj.crawl(url);

    }

    // Prints the status of the URL, then follows its "location" header, if any.
    // Note: there is no depth limit, so a redirect loop would recurse forever.
    private void crawl(String url) throws IOException {

        Response response = Jsoup.connect(url).followRedirects(false).execute();

        System.out.println(response.statusCode() + " : " + url);

        if (response.hasHeader("location")) {
            String redirectUrl = response.header("location");
            crawl(redirectUrl);
        }

    }

}
Output
301 : http://goo.gl/fb/gyBkwR
301 : http://feeds.feedburner.com/~r/FeedForMkyong/~3/D_6Jqi4trqo/...
200 : http://www.mkyong.com/mongodb/mongodb-remove-a-field-from-array-documents/
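The recursive crawl above keeps going for as long as a "location" header is present, so a redirect loop would never terminate. A possible variation, sketched here under stated assumptions, follows the chain in a loop with a hop limit and a set of visited URLs. To keep the sketch runnable without a network, the HTTP lookup is passed in as a function. The class RedirectChain, the follow method, the maxHops parameter, and the example URLs are all hypothetical and not part of the original article.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

public class RedirectChain {

    // Follows redirects until a URL has no target, a URL repeats,
    // or maxHops redirects have been followed.
    // nextHop returns the redirect target of a URL, or empty if it does not redirect.
    public static List<String> follow(String startUrl,
                                      Function<String, Optional<String>> nextHop,
                                      int maxHops) {
        List<String> chain = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        String current = startUrl;
        while (current != null && chain.size() <= maxHops && seen.add(current)) {
            chain.add(current);
            current = nextHop.apply(current).orElse(null);
        }
        return chain;
    }

    public static void main(String[] args) {
        // Stubbed redirect table with hypothetical URLs. With Jsoup, nextHop could
        // wrap Jsoup.connect(url).followRedirects(false).execute().header("location")
        // and handle the IOException that execute() may throw.
        Map<String, String> table = Map.of(
                "http://a.example/", "http://b.example/",
                "http://b.example/", "http://c.example/");
        List<String> chain = follow("http://a.example/",
                u -> Optional.ofNullable(table.get(u)), 10);
        chain.forEach(System.out::println);
    }
}
```

Passing the lookup as a function keeps the loop logic separate from the HTTP client, which also makes it straightforward to exercise with a fixed table as in main.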
References
Translated from: https://mkyong.com/java/jsoup-check-redirect-url/