Web Crawling
Zziny
2021. 10. 26. 21:03
1. Add Jsoup from the Maven repository to pom.xml
2. Refer to the Jsoup API documentation
Overview (jsoup Java HTML Parser 1.14.3 API)
jsoup.org
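Step 1 above might look like the following in pom.xml. This is a sketch that assumes the 1.14.3 release described by the linked API doc; use whichever version the Maven repository currently lists.

```xml
<!-- Goes inside the <dependencies> element of pom.xml -->
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <!-- Assumed version, matching the API doc referenced above -->
    <version>1.14.3</version>
</dependency>
```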
Usage examples
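import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
// `logger` is assumed to be a logger field of the enclosing class (e.g. SLF4J).
public void jsoupTest() throws IOException {
    Element bodyElement = Jsoup.connect("https://comic.naver.com/webtoon/weekday").get().body();
    // The selector ends in img, so the matched elements are <img> tags.
    Elements imgList = bodyElement.select("#content > div.webtoon_spot2 > ul > li > div > a > img");
    for (Element element : imgList) {
        logger.debug(element.attr("title"));
    }
}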
public void jsoupTest() throws IOException {
    // Select every element with class rig_value02 nested under an <li>.
    Elements valueElements = Jsoup.connect("http://www.khoa.go.kr/oceangrid/koofs/kor/observation/obs_real_list.do")
            .get().select("li .rig_value02");
    for (Element element : valueElements) {
        // attr() already returns a String and yields "" when the attribute is absent.
        logger.debug(element.attr("title"));
    }
}
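Both examples above depend on live pages, whose markup can change at any time. To try the same select and attr calls offline, here is a minimal sketch that parses an in-memory string instead. The class name JsoupSelectSketch, the helper methods, and the sample HTML are all hypothetical; the markup is only loosely shaped like the selectors above and is not the real page structure.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class JsoupSelectSketch {

    // Hypothetical markup for illustration only; not copied from either site.
    static final String SAMPLE_HTML =
            "<div id=\"content\"><ul>"
          + "<li><a href=\"/a\"><img src=\"a.png\" title=\"First\"></a></li>"
          + "<li><a href=\"/b\"><img src=\"b.png\" title=\"Second\"></a></li>"
          + "</ul><p class=\"rig_value02\">12.3</p></div>";

    // Collects the title attribute of every <img> matched by the CSS selector.
    static List<String> imageTitles(String html) {
        Document doc = Jsoup.parse(html);
        List<String> titles = new ArrayList<>();
        for (Element img : doc.select("#content > ul > li > a > img")) {
            titles.add(img.attr("title"));
        }
        return titles;
    }

    // Returns the text of the first element with the given class, or "" if none matches.
    static String firstText(String html, String cssClass) {
        Element el = Jsoup.parse(html).selectFirst("." + cssClass);
        return el == null ? "" : el.text();
    }

    public static void main(String[] args) {
        System.out.println(imageTitles(SAMPLE_HTML));               // prints [First, Second]
        System.out.println(firstText(SAMPLE_HTML, "rig_value02"));  // prints 12.3
    }
}
```

If a value sits in an element's text rather than in a title attribute, text() is the call to use; attr("title") returns an empty string when the attribute is missing.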