Weil Jimmer's BlogWeil Jimmer's Blog


Category:Experience

Found 52 records. At Page 7 / 11.

PHP string to html characters 字串轉HTML碼

No Comments
-
更新於 2016-01-11 20:09:41

緣由:

最近因為裝了PHP7的緣故,又加上有一點點的閒,故研究PHP這塊。尤其是運算耗時方面,因為我本身就是要測試PHP7可以快到何種地步。聽說是快20倍的樣子。

結果,到最後我反而開始寫程式碼,昨天修正BBcode檢查的函數,運行約一萬字的代碼,結果速度慢到極致,被我修改後,倒加速不少,總速度提升82倍,原本12.3秒,修正到0.15秒(如果僅比較檢查BBcode的函數,加快上萬倍。),後來遇上了這個轉HTMLcode的問題,原先寫得太慢了,這是我第三次修正!

//目前我寫的最快的版本。

function string_to_utf16_ascii_HTML_new2($str){
	$str=mb_convert_encoding($str, 'ucs-2', 'utf-8');
	for($i=0;$i<strlen($str);$i+=2){
		$str2.='&#'.str_pad((ord($str[$i])*256+ord($str[$i+1])),5,'0',STR_PAD_LEFT).';';
	}
	return $str2;
}

//稍慢的版本(舊版)

function string_to_utf16_ascii_HTML_new($str){
	for ($i=0;$i<strlen($str);$i++){
		$k=ord($str[$i]);
		if ($k >= 192 and $k <= 223){
			$k=$k-192;
			$k2=ord($str[$i+1])-128;
			$c=$c.'&#'.str_pad($k*64+$k2,5,'0',STR_PAD_LEFT).';';
		}elseif ($k >= 224 and $k <= 239){
			$k=$k-224;
			$k2=ord($str[$i+1])-128;
			$k3=ord($str[$i+2])-128;
			$c=$c.'&#'.str_pad($k*4096+$k2*64+$k3,5,'0',STR_PAD_LEFT).';';
		}elseif ($k >= 128 and $k <= 191){
		}else{
			$c=$c.'&#'.str_pad($k,5,'0',STR_PAD_LEFT).';';
		}
	}
	return $c;
}

//極慢的版本(初版)

function string_to_utf16_ascii_HTML($str){
	for ($i=0;$str[$i]!="";$i++){
		$b[]=ord($str[$i]);
	}
	$str2=',';
	for ($i=0;$str[$i]!="";$i++){
		$xa=str_pad(decbin($b[$i]),8,'0',STR_PAD_LEFT);
		if (substr($xa,0,3) == "110"){
			$xb=str_pad(decbin($b[$i+1]),8,'0',STR_PAD_LEFT);
			$xx1=substr($xa,3);
			$xx2=substr($xb,2);
			$c=$c.$str2.str_pad(bindec($xx1.$xx2),5,'0',STR_PAD_LEFT);
		}elseif (substr($xa,0,4) == "1110"){
			$xb=str_pad(decbin($b[$i+1]),8,'0',STR_PAD_LEFT);
			$xc=str_pad(decbin($b[$i+2]),8,'0',STR_PAD_LEFT);
			$xx1=substr($xa,3);
			$xx2=substr($xb,2);
			$xx3=substr($xc,2);
			$c=$c.$str2.str_pad(bindec($xx1.$xx2.$xx3),5,'0',STR_PAD_LEFT);
		}elseif (substr($xa,0,2) == "10"){
		}else{
			$c=$c.$str2.str_pad($b[$i],5,'0',STR_PAD_LEFT);
		}
	}
	return substr($c,1);
}

//將HTMLcode轉回UTF-8字串。

function utf16_ascii_HTML_to_string($str){
	return mb_convert_encoding($str, 'UTF-8', 'HTML-ENTITIES'); 
}

/********

使用方法:

*********

$str='這是測試123';
$encode_str=string_to_utf16_ascii_HTML_new2($str);
echo $encode_str;

//output:
//&#36889;&#26159;&#28204;&#35430;&#00049;&#00050;&#00051;
//以瀏覽器顯示,等同於「這是測試123」。

echo utf16_ascii_HTML_to_string($encode_str);

//output:
//這是測試123

*/

 


This entry was posted in General, Experience, Functions, Note, PHP By Weil Jimmer.

長時間網站掛掉的感覺

No Comments
-
發布於 2015-12-15 19:01:53

本站在一週前遭遇中國猛烈DDOS攻擊,導致網站下線。我真心很想罵服務商,不過因為是免費的,就不好意思開口,很想說不會搞主機還學別人開放服務,趕快回去讀書吧。

本站經過多重搬遷 從原本的 Nazuka 搬到 GoogieHost 再搬到 EcVps 又再次搬到 LionFree (不推薦使用),本次被攻擊也是 只有 LionFree 才有的情況,以往頂多是暫時無法訪問,所以我又搬遷了一次主機。

心情很差,網站Offline時,很多事情不方便,此外,也沒辦法向身邊人傳達訊息。

(免費的真的很差勁!凡事都要備份!)


This entry was posted in Experience, Mood By Weil Jimmer.

程式設計缺陷足以修改Header偽造IP

No Comments
-
更新於 2017-05-19 18:59:14

這其實是個老梗,很久以前就知道了。今天又突然想到,發篇文章好了。

很多網站上的程式教學都是錯誤的,取得IP方法用get Header 的 Client-IP 及 X-Forwarded-For

真的是禍害,Header是可以給用戶隨意變更的。

我修改了 Header 在網站上隨意搜尋 What is my ip ?。

結果大部分網站都有問題!

我怎麼改,網站就怎麼顯示,怎麼都對。如果今天我把Header改成<script>標籤,不就造成了XSS攻擊漏洞?改成SQL語法,造成SQL注入漏洞。

只是剛好想到,之前也沒發過類似的文章,補發。


This entry was posted in General, Experience, HTML, The Internet, PHP By Weil Jimmer.

Python 進階下載器

No Comments
-
更新於 2015-11-20 18:47:40

我會寫這個,一來是為了讓朋友羨慕也想學程式,二來是我自己要用。我不會閒閒開發一個我用不到的東西。總是開發者要支持一下自己的產品嘛。

這是用Python3寫成的程式,主要針對手機而使用。需要安裝BeautifulSoup4。

(在Windows下的命令提示字元顯示很醜,沒有顏色,實際上在手機Linux終端機裡面跑的時候,會有顏色。)

主要支援以下功能:

一、抓取目標URL的目標連結

例如我要抓取IG某個頁面的所有圖片。

二、載入網頁列表,抓取目標連結

例如:載入某網站相簿的第一頁,抓取圖片,然後載入第二頁,抓取圖片,載入第三頁……以此類推。

三、規律網址抓取,這個算是最低階的方法吧。

例如:下載http://example.com/1.jpg,下載http://example.com/2.jpg,下載http://example.com/3.jpg,下載/4.jpg下載/5.jpg……

四、顯示目標清單

五、下載清單上的連結

至於抓圖功能,我可以稱進階抓圖器是沒有講假的,雖然還比不上我用VB.NET寫出來的 強大。那種仿一般正常用戶框架又有COOKIE、HEADER、還解析JS,Python很難辦得到。

所以,頂多次級一點。

支援:

一、抓取頁面上所有「看起來是網址」的連結。(即便它沒有被鑲入在任何標籤內)(採用正規表達式偵測)

二、抓取A標籤的屬性HREF。(超連結)

三、抓取IMG標籤的屬性SRC。(圖片)

四、抓取SOURCE標籤的屬性SRC。(HTML5的audio、movie)

五、抓取EMBED標籤的SRC屬性。(FLASH)

六、抓取OBJECT標籤的DATA屬性。(網頁插件)

七、LINK標籤的HREF屬性。(CSS)

八、SCRIPT標籤的SRC屬性。(JS)

九、FRAME標籤的SRC屬性。(框架)

十、IFRAME標籤的SRC屬性。(內置框架)

十一、以上全部。

十二、自訂抓取標籤名稱與屬性名稱。(這個我VB板的進階抓圖器沒有這項功能)

支援 過濾關鍵字,包刮AND、OR邏輯閘,一定要全部包刮關鍵字,或是命中其一關鍵字。

規律網址下載則支援,起始數字、終止數字、每次遞增多少、補位多少。

※這個有相對位置的處理。

****************************************

* 名稱:進階下載器

* 團隊:White Birch Forum Team

* 作者:Weil Jimmer

* 網站:http://0000.twgogo.org/

* 時間:2015.09.26

****************************************

Source Code

# coding: utf-8
"""Weil Jimmer For Safe Test Only"""
import os,urllib.request,shutil,sys,re
from threading import Thread
from time import sleep
from sys import platform as _platform

GRAY = "\033[1;30m"
RED = "\033[1;31m"
LIME = "\033[1;32m"
YELLOW = "\033[1;33m"
BLUE = "\033[1;34m"
MAGENTA = "\033[1;35m"
CYAN = "\033[1;36m"
WHITE = "\033[1;37m"
BGRAY = "\033[1;47m"
BRED = "\033[1;41m"
BLIME = "\033[1;42m"
BYELLOW = "\033[1;43m"
BBLUE = "\033[1;44m"
BMAGENTA = "\033[1;45m"
BCYAN = "\033[1;46m"
BDARK_RED = "\033[1;48m"
UNDERLINE = "\033[4m"
END = "\033[0m"

if _platform.find("linux")<0:
	GRAY = ""
	RED = ""
	LIME = ""
	YELLOW = ""
	BLUE = ""
	MAGENTA = ""
	CYAN = ""
	WHITE = ""
	BGRAY = ""
	BRED = ""
	BLIME = ""
	BYELLOW = ""
	BBLUE = ""
	BMAGENTA = ""
	BCYAN = ""
	UNDERLINE = ""
	END = ""
	os.system("color e")

try:
    import pip
except:
	print(RED + "錯誤沒有安裝pip!" + END)
	input()
	exit()

try:
    from bs4 import BeautifulSoup
except:
	print(RED + "錯誤沒有安裝bs4!嘗試安裝中...!" + END)
	pip.main(["install","beautifulsoup4"])
	from bs4 import BeautifulSoup

global phone_
phone_ = False

try:
	import android
	droid = android.Android()
	phone_ = True
except:
	try:
		import clipboard
	except:
		print(RED + "錯誤沒有安裝clipboard!嘗試安裝中...!" + END)
		pip.main(["install","PyGTK"])
		pip.main(["install","clipboard"])
		import clipboard

def get_clipboard():
	global phone_
	if phone_==True:
		return str(droid.getClipboard().result)
	else:
		return clipboard.paste()

global target_url
target_url = [[],[],[],[],[],[],[],[],[]]

def __init__(self):
	print("")

print (RED)
print ("*" * 40)
print ("*  Name:\tWeil_Advanced_Downloader")
print ("*  Team:" + LIME + "\tWhite Birch Forum Team" + RED)
print ("*  Developer:\tWeil Jimmer")
print ("*  Website:\thttp://0000.twgogo.org/")
print ("*  Date:\t2015.10.09")
print ("*" * 40)
print (END)

root_dir = "/sdcard/"
print("根目錄:" + root_dir)
global save_temp_dir
global save_dir
save_dir=str(input("存檔資料夾:"))
save_temp_dir=str(input("暫存檔資料夾(會自動刪除):"))

global target_array_index
target_array_index = 0

def int_s(k):
	try:
		return int(k)
	except:
		return -1

def reporthook(blocknum, blocksize, totalsize):
	readsofar = blocknum * blocksize
	if totalsize > 0:
		percent = readsofar * 1e2 / totalsize
		s = "\r%5.1f%% %*d / %d bytes" % (percent, len(str(totalsize)), readsofar, totalsize)
		sys.stderr.write(s)
		if readsofar >= totalsize:
			sys.stderr.write("\r" + MAGENTA + "%5.1f%% %*d / %d bytes" % (100, len(str(totalsize)), totalsize, totalsize))
	else:
		sys.stderr.write("\r未知檔案大小…下載中…" + str(readsofar) + " bytes")
		#sys.stderr.write("read %d\n" % (readsofar,))

def url_encode(url_):
	if url_.startswith("http://"):
		return 'http://' + urllib.parse.quote(url_[7:])
	elif url_.startswith("https://"):
		return 'https://' + urllib.parse.quote(url_[8:])
	elif url_.startswith("ftp://"):
		return 'ftp://' + urllib.parse.quote(url_[6:])
	elif ((not url_.startswith("ftp://")) and (not url_.startswith("http"))):
		return 'http://' + urllib.parse.quote(url_)
	return url_

def url_correct(url_):
	if ((not url_.startswith("ftp://")) and (not url_.startswith("http"))):
		return 'http://' + (url_)
	return url_

def download_URL(url,dir_name,ix,total,encode,return_yes_no):
	global save_temp_dir
	prog_str = "(" + str(ix) + "/" + str(total) + ")"
	if (total==0):
		prog_str=""
	file_name = url.split('/')[-1]
	file_name=file_name.replace(":","").replace("*","").replace('"',"").replace("\\","").replace("|","").replace("?","").replace("<","").replace(">","")
	if file_name=="":
		file_name="NULL"
	try:
		print(YELLOW + "下載中…" + prog_str + "\n" + url + "\n" + END)
		if not os.path.exists(root_dir + dir_name + "/") :
			os.makedirs(root_dir + dir_name + "/")
		opener = urllib.request.FancyURLopener({})
		opener.version = 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'
		opener.addheader("Referer", url)
		opener.addheader("X-Forwarded-For", "0.0.0.0")
		opener.addheader("Client-IP", "0.0.0.0")
		local_file,response_header=opener.retrieve(url_encode(url), root_dir + dir_name + "/" + str(ix) + "-" + file_name, reporthook)
		print(MAGENTA + "下載完成" + prog_str + "!" + END)
	except urllib.error.HTTPError as ex:
		print(RED + "下載失敗" + prog_str + "!" + str(ex.code) + END)
	except:
		print(RED + "下載失敗" + prog_str + "!未知錯誤!" + END)
	if return_yes_no==0:
		return ""
	try:
		k=open(local_file,encoding=encode).read()
	except:
		k="ERROR"
		print(RED + "讀取失敗!" + END)
	try:
		if dir_name==save_temp_dir:
			shutil.rmtree(root_dir + save_temp_dir + "/")
	except:
		print(RED + "刪除暫存資料夾失敗!" + END)
	return k

def check_in_filter(url_array,and_or,keyword_str):
	if keyword_str=="":
		return url_array
	url_filter_array = []
	s = keyword_str.split(',')
	for array_x in url_array:
		ok = True
		for keyword_ in s:
			if str(array_x).find(keyword_)>=0:
				if and_or==0:
					url_filter_array.append(array_x)
					ok=False
					break
			else:
				if and_or==1:
					ok=False
					break
		if ok==True:
			url_filter_array.append(array_x)
	return url_filter_array

def handle_relative_url(handle_url,ori_url):
	handle_url=str(handle_url)
	if handle_url=="":
		return ori_url
	if handle_url.startswith("?"):
		temp_form_url = ori_url
		search_A = ori_url.find("?")
		if search_A<0:
			return ori_url + handle_url
		else:
			return ori_url[0:search_A] + handle_url
	if handle_url.startswith("//"):
		return "http:" + handle_url
	if (handle_url.startswith("http://") or handle_url.startswith("https://") or handle_url.startswith("ftp://")):
		return handle_url
	root_url = ori_url
	search_ = root_url.find("//")
	if search_<0:
		return handle_url
	search_x = root_url.find("/", search_+2);
	if (search_x<0):
		root_url = ori_url
	else:
		root_url = ori_url[0:search_x]
	same_dir_url = ori_url[search_+2:]
	search_x2 = same_dir_url.rfind("/")
	if search_x2<0:
		same_dir_url = ori_url
	else:
		same_dir_url = ori_url[0:search_x2+search_+2]
	if handle_url.startswith("/"):
		return (root_url + handle_url)
	if handle_url.startswith("./"):
		return (same_dir_url + handle_url[1:])
	return (same_dir_url + "/" + handle_url)

def remove_duplicates(values):
	output = []
	seen = set()
	for value in values:
		if value not in seen:
			output.append(value)
			seen.add(value)
	return output

def get_text_url(file_content):
	url_return_array = re.findall('(http|https|ftp)://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&amp;\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?', file_content)
	return url_return_array

def get_url_by_tagname_attribute(file_content,tagname,attribute,url_):
	soup = BeautifulSoup(file_content,'html.parser')
	url_return_array = []
	for link in soup.find_all(tagname):
		if link.get(attribute)!=None:
			url_return_array.append(handle_relative_url(link.get(attribute),url_))
	return url_return_array

def get_url_by_targetid_attribute(file_content,tagname,attribute,url_):
	soup = BeautifulSoup(file_content,'html.parser')
	url_return_array = []
	for link in soup.find_all(id=tagname):
		if link.get(attribute)!=None:
			url_return_array.append(handle_relative_url(link.get(attribute),url_))
	return url_return_array

def get_url_by_targetname_attribute(file_content,tagname,attribute,url_):
	soup = BeautifulSoup(file_content,'html.parser')
	url_return_array = []
	for link in soup.find_all(name=tagname):
		if link.get(attribute)!=None:
			url_return_array.append(handle_relative_url(link.get(attribute),url_))
	return url_return_array

def run_functional_get_url(way_X,html_code,target_array_index,and_or,keywords,ctagename,cattribute):
	global target_url
	if (way_X==1):
		ori_size=len(target_url[target_array_index])
		get_array_ = get_text_url(html_code)
		target_url[target_array_index].extend(get_array_)
		target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
		target_url[target_array_index]=check_in_filter(target_url[target_array_index],and_or,keywords)
		print( LIME + "抓取完成!共抓取到:" + str(len(target_url[target_array_index])-ori_size) + "個URL" + END)
	elif(way_X==2):
		ori_size=len(target_url[target_array_index])
		get_array_ = get_url_by_tagname_attribute(html_code,"a","href",temp_url)
		target_url[target_array_index].extend(get_array_)
		target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
		target_url[target_array_index]=check_in_filter(target_url[target_array_index],and_or,keywords)
		print( LIME + "抓取完成!共抓取到:" + str(len(target_url[target_array_index])-ori_size) + "個URL" + END)
	elif(way_X==3):
		ori_size=len(target_url[target_array_index])
		get_array_ = get_url_by_tagname_attribute(html_code,"img","src",temp_url)
		target_url[target_array_index].extend(get_array_)
		target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
		target_url[target_array_index]=check_in_filter(target_url[target_array_index],and_or,keywords)
		print( LIME + "抓取完成!共抓取到:" + str(len(target_url[target_array_index])-ori_size) + "個URL" + END)
	elif(way_X==4):
		ori_size=len(target_url[target_array_index])
		get_array_ = get_url_by_tagname_attribute(html_code,"source","src",temp_url)
		target_url[target_array_index].extend(get_array_)
		target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
		target_url[target_array_index]=check_in_filter(target_url[target_array_index],and_or,keywords)
		print( LIME + "抓取完成!共抓取到:" + str(len(target_url[target_array_index])-ori_size) + "個URL" + END)
	elif(way_X==5):
		ori_size=len(target_url[target_array_index])
		get_array_ = get_url_by_tagname_attribute(html_code,"embed","src",temp_url)
		target_url[target_array_index].extend(get_array_)
		target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
		target_url[target_array_index]=check_in_filter(target_url[target_array_index],and_or,keywords)
		print( LIME + "抓取完成!共抓取到:" + str(len(target_url[target_array_index])-ori_size) + "個URL" + END)
	elif(way_X==6):
		ori_size=len(target_url[target_array_index])
		get_array_ = get_url_by_tagname_attribute(html_code,"object","data",temp_url)
		target_url[target_array_index].extend(get_array_)
		target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
		target_url[target_array_index]=check_in_filter(target_url[target_array_index],and_or,keywords)
		print( LIME + "抓取完成!共抓取到:" + str(len(target_url[target_array_index])-ori_size) + "個URL" + END)
	elif(way_X==7):
		ori_size=len(target_url[target_array_index])
		get_array_ = get_url_by_tagname_attribute(html_code,"link","href",temp_url)
		target_url[target_array_index].extend(get_array_)
		target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
		target_url[target_array_index]=check_in_filter(target_url[target_array_index],and_or,keywords)
		print( LIME + "抓取完成!共抓取到:" + str(len(target_url[target_array_index])-ori_size) + "個URL" + END)
	elif(way_X==8):
		ori_size=len(target_url[target_array_index])
		get_array_ = get_url_by_tagname_attribute(html_code,"script","src",temp_url)
		target_url[target_array_index].extend(get_array_)
		target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
		target_url[target_array_index]=check_in_filter(target_url[target_array_index],and_or,keywords)
		print( LIME + "抓取完成!共抓取到:" + str(len(target_url[target_array_index])-ori_size) + "個URL" + END)
	elif(way_X==9):
		ori_size=len(target_url[target_array_index])
		get_array_ = get_url_by_tagname_attribute(html_code,"frame","src",temp_url)
		target_url[target_array_index].extend(get_array_)
		target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
		target_url[target_array_index]=check_in_filter(target_url[target_array_index],and_or,keywords)
		print( LIME + "抓取完成!共抓取到:" + str(len(target_url[target_array_index])-ori_size) + "個URL" + END)
	elif(way_X==10):
		ori_size=len(target_url[target_array_index])
		get_array_ = get_url_by_tagname_attribute(html_code,"iframe","src",temp_url)
		target_url[target_array_index].extend(get_array_)
		target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
		target_url[target_array_index]=check_in_filter(target_url[target_array_index],and_or,keywords)
		print( LIME + "抓取完成!共抓取到:" + str(len(target_url[target_array_index])-ori_size) + "個URL" + END)
	elif(way_X==11):
		ori_size=len(target_url[target_array_index])
		get_array_1 = get_text_url(html_code)
		get_array_2 = get_url_by_tagname_attribute(html_code,"a","href",temp_url)
		get_array_3 = get_url_by_tagname_attribute(html_code,"img","src",temp_url)
		get_array_4 = get_url_by_tagname_attribute(html_code,"source","src",temp_url)
		get_array_5 = get_url_by_tagname_attribute(html_code,"embed","src",temp_url)
		get_array_6 = get_url_by_tagname_attribute(html_code,"object","data",temp_url)
		get_array_7 = get_url_by_tagname_attribute(html_code,"link","href",temp_url)
		get_array_8 = get_url_by_tagname_attribute(html_code,"script","src",temp_url)
		get_array_9 = get_url_by_tagname_attribute(html_code,"frame","src",temp_url)
		get_array_10 = get_url_by_tagname_attribute(html_code,"iframe","src",temp_url)
		target_url[target_array_index].extend(get_array_1)
		target_url[target_array_index].extend(get_array_2)
		target_url[target_array_index].extend(get_array_3)
		target_url[target_array_index].extend(get_array_4)
		target_url[target_array_index].extend(get_array_5)
		target_url[target_array_index].extend(get_array_6)
		target_url[target_array_index].extend(get_array_7)
		target_url[target_array_index].extend(get_array_8)
		target_url[target_array_index].extend(get_array_9)
		target_url[target_array_index].extend(get_array_10)
		target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
		target_url[target_array_index]=check_in_filter(target_url[target_array_index],and_or,keywords)
		print( LIME + "抓取完成!共抓取到:" + str(len(target_url[target_array_index])-ori_size) + "個URL" + END)
	elif(way_X==12):
		ori_size=len(target_url[target_array_index])
		get_array_custom = get_url_by_tagname_attribute(html_code,ctagename,cattribute,temp_url)
		target_url[target_array_index].extend(get_array_custom)
		target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
		target_url[target_array_index]=check_in_filter(target_url[target_array_index],and_or,keywords)
		print( LIME + "抓取完成!共抓取到:" + str(len(target_url[target_array_index])-ori_size) + "個URL" + END)
	elif(way_X==13):
		ori_size=len(target_url[target_array_index])
		get_array_custom = get_url_by_targetid_attribute(html_code,ctagename,cattribute,temp_url)
		target_url[target_array_index].extend(get_array_custom)
		target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
		target_url[target_array_index]=check_in_filter(target_url[target_array_index],and_or,keywords)
		print( LIME + "抓取完成!共抓取到:" + str(len(target_url[target_array_index])-ori_size) + "個URL" + END)
	elif(way_X==14):
		ori_size=len(target_url[target_array_index])
		get_array_custom = get_url_by_targetname_attribute(html_code,ctagename,cattribute,temp_url)
		target_url[target_array_index].extend(get_array_custom)
		target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
		target_url[target_array_index]=check_in_filter(target_url[target_array_index],and_or,keywords)
		print( LIME + "抓取完成!共抓取到:" + str(len(target_url[target_array_index])-ori_size) + "個URL" + END)

while True:	
	while True:
		method_X=int_s(input("\n\n" + CYAN + "要執行的動作:\n(1)抓取目標網頁資料\n(2)載入網址列表抓取資料\n(3)規律網址抓取\n(4)顯示目前清單\n(5)下載目標清單\n(6)清空目標清單\n(7)複製清單\n(8)刪除目標清單的特定值\n(9)從剪貼簿貼上網址(每個一行)" + END + "\n\n"))
		if method_X<=9 and method_X>=1:
			break
		else:
			print("輸入有誤!\n\n")
	if (method_X==1):
		while True:
			target_array_index=int_s(input("\n\n" + CYAN + "您要「存入」的目標清單:(請輸入編號1~8)" + END + "\n\n"))
			if target_array_index<=8 and target_array_index>=1:
				break
			else:
				print("輸入有誤!只開放8個清單,請輸入1~8之間的號碼!")
		while True:
			way_X=int_s(input("\n\n" + CYAN + "抓取方式:\n(1)搜尋頁面所有純文字網址\n(2)抓取所有A標籤HREF屬性\n(3)抓取所有IMG標籤SRC屬性\n(4)抓取所有SOURCE標籤SRC屬性\n(5)抓取所有EMBED標籤SRC屬性\n(6)抓取所有OBJECT標籤DATA屬性\n(7)抓取所有LINK標籤HREF屬性\n(8)抓取所有SCRIPT標籤SRC屬性\n(9)抓取所有FRAME標籤SRC屬性\n(10)抓取所有IFRAME標籤SRC屬性\n(11)使用以上所有方法\n(12)自訂找尋標籤名及屬性名\n(13)自訂找尋ID及屬性名\n(14)自訂找尋Name及屬性名" + END + "\n\n"))
			if way_X<=14 and way_X>=1:
				break
		else:
			print("輸入有誤!")
		if way_X==12 or way_X==13 or way_X==14:	
			target_tagname_=str(input("\n目標標籤名稱/ID/Name:"))
			target_attribute_=str(input("\n目標屬性名稱:"))
		else:
			target_tagname_=""
			target_attribute_=""
		temp_url=url_correct(str(input("\n目標網頁URL:")))
		temp_url_code=str(input("\n目標網頁編碼(請輸入utf-8或big5或gbk…):"))
		keywords=str(input("\n" + CYAN + "請輸入過濾關鍵字(可留空,可多個,逗號為分隔符號):" + END + "\n\n"))
		and_or=0
		if keywords!="":
			while True:
				and_or=(-1)
				try:
					and_or=int_s(input("\n" + CYAN + "請輸入關鍵字邏輯閘:(1=and、0=or)" + END + "\n\n"))
				except:
					print("")
				if and_or==0 or and_or==1:
					break
				else:
					print("輸入有誤!\n")
		html_code=download_URL(temp_url,save_temp_dir,0,0,temp_url_code,1)
		if html_code=="ERROR":
			continue
		run_functional_get_url(way_X,html_code,target_array_index,and_or,keywords,target_tagname_,target_attribute_)
		input("\n\n完成!請輸入ENTER鍵跳出此功能...");
	elif(method_X==2):
		while True:
			RUN_array_index=int_s(input("\n\n" + CYAN + "您要「載入」的目標清單:(請輸入編號1~8)" + END + "\n\n"))
			if RUN_array_index<=8 and RUN_array_index>=1:
				break
			else:
				print("輸入有誤!只開放8個清單,請輸入1~8之間的號碼!")
		while True:
			target_array_index=int_s(input("\n\n" + CYAN + "您要「存入」的目標清單:(請輸入編號1~8)" + END + "\n\n"))
			if target_array_index<=8 and target_array_index>=1:
				break
			else:
				print("輸入有誤!只開放8個清單,請輸入1~8之間的號碼!")
		while True:
			way_X=int_s(input("\n\n" + CYAN + "抓取方式:\n(1)搜尋頁面所有純文字網址\n(2)抓取所有A標籤HREF屬性\n(3)抓取所有IMG標籤SRC屬性\n(4)抓取所有SOURCE標籤SRC屬性\n(5)抓取所有EMBED標籤SRC屬性\n(6)抓取所有OBJECT標籤DATA屬性\n(7)抓取所有LINK標籤HREF屬性\n(8)抓取所有SCRIPT標籤SRC屬性\n(9)抓取所有FRAME標籤SRC屬性\n(10)抓取所有IFRAME標籤SRC屬性\n(11)使用以上所有方法\n(12)自訂找尋標籤名及屬性名\n(13)自訂找尋ID及屬性名\n(14)自訂找尋Name及屬性名" + END + "\n\n"))
			if way_X<=14 and way_X>=1:
				break
		else:
			print("輸入有誤!")
		if way_X==12 or way_X==13 or way_X==14:
			target_tagname_=str(input("\n目標標籤名稱/ID/Name:"))
			target_attribute_=str(input("\n目標屬性名稱:"))
		else:
			target_tagname_=""
			target_attribute_=""
		keywords=str(input("\n" + CYAN + "請輸入過濾關鍵字(可留空,可多個,逗號為分隔符號):" + END + "\n\n"))
		while True:
			and_or=int_s(input("\n" + CYAN + "請輸入關鍵字邏輯閘:(1=and、0=or)" + END + "\n\n"))
			if and_or==0 or and_or==1:
				break
			else:
				print("輸入有誤!\n\n")
		temp_url_code=str(input("\n集合的網頁編碼(請輸入utf-8或big5或gbk…):"))
		for x in range(0,(len(target_url[RUN_array_index]))):
			html_code=download_URL(str(target_url[RUN_array_index][x]),save_temp_dir,(x+1),len(target_url[RUN_array_index]),temp_url_code,1)
			if html_code=="ERROR":
				continue
			run_functional_get_url(way_X,html_code,target_array_index,and_or,keywords,target_tagname_,target_attribute_)
		input("\n\n完成!請輸入ENTER鍵跳出此功能...");
	elif(method_X==3):
		start_number=int_s(input("起始點(數字):"))
		end_number=int_s(input("終止點(數字):"))
		step_ADD=int_s(input("每次遞增多少:"))
		str_padx=int_s(input("補滿位數至:"))
		if not os.path.exists('/sdcard/' + save_dir) :
			os.makedirs('/sdcard/' + save_dir )
		print(LIME + "※檔案將存在/sdcard/" + save_dir + "資料夾。" + END)
		while True:
			url=url_correct(input(LIME + "目標URL({i}是遞增數):" + END))
			if url.find("{i}")>=0:
				break
			else:
				print("網址未包含遞增數,請重新輸入網址。")
		for x in range(start_number,(end_number+1),step_ADD):
			download_URL(url.replace("{i}",str(x).zfill(str_padx)),save_dir,x,(end_number),"utf-8",0)
		input("\n\n完成!請輸入ENTER鍵跳出此功能...")
	elif(method_X==4):
		while True:
			RUN_array_index=int_s(input("\n\n" + CYAN + "您要「顯示」的目標清單:(請輸入編號1~8)" + END + "\n\n"))
			if RUN_array_index<=8 and RUN_array_index>=1:
				break
			else:
				print("輸入有誤!只開放8個清單,請輸入1~8之間的號碼!")
		for x in range(0,(len(target_url[RUN_array_index]))):
			print("URL (" + str(x+1) + "/" + str(len(target_url[RUN_array_index])) + "):" + str(target_url[RUN_array_index][x]))
		input("\n\n完成!請輸入ENTER鍵跳出此功能...")
	elif(method_X==5):
		while True:
			RUN_array_index=int_s(input("\n\n" + CYAN + "您要「下載」的目標清單:(請輸入編號1~8)" + END + "\n\n"))
			if RUN_array_index<=8 and RUN_array_index>=1:
				break
			else:
				print("輸入有誤!只開放8個清單,請輸入1~8之間的號碼!")
		for x in range(0,(len(target_url[RUN_array_index]))):
			download_URL(str(target_url[RUN_array_index][x]),save_dir,(x+1),len(target_url[RUN_array_index]),"utf-8",0)
		input("\n\n完成!請輸入ENTER鍵跳出此功能...")
	elif(method_X==6):
		ver = str(input("\n\n" + RED + "確定清空目標清單?(y/n)" + END + "\n\n"))
		if ver.lower()=="y":
			while True:
				RUN_array_index=int_s(input("\n\n" + CYAN + "您要「清空」的目標清單:(請輸入編號1~8)" + END + "\n\n"))
				if RUN_array_index<=8 and RUN_array_index>=1:
					break
				else:
					print("輸入有誤!只開放8個清單,請輸入1~8之間的號碼!")
			target_url[RUN_array_index]=[]
			input("\n\n完成!請輸入ENTER鍵跳出此功能...")
	elif(method_X==7):
		ver = str(input("\n\n" + RED + "確定複製目標清單?(y/n)" + END + "\n\n"))
		if ver.lower()=="y":
			while True:
				RUN_array_index=int_s(input("\n\n" + CYAN + "您要「複製」的來源清單:(請輸入編號1~8)" + END + "\n\n"))
				if RUN_array_index<=8 and RUN_array_index>=1:
					break
				else:
					print("輸入有誤!只開放8個清單,請輸入1~8之間的號碼!")
			while True:
				target_array_index=int_s(input("\n\n" + CYAN + "您要「存入」的目標清單:(請輸入編號1~8)" + END + "\n\n"))
				if target_array_index<=8 and target_array_index>=1:
					break
				else:
					print("輸入有誤!只開放8個清單,請輸入1~8之間的號碼!")
			target_url[target_array_index]=target_url[RUN_array_index]
			input("\n\n完成!請輸入ENTER鍵跳出此功能...")
	elif(method_X==8):
		ver = str(input("\n\n" + RED + "確定刪除目標清單特定值?(y/n)" + END + "\n\n"))
		if ver.lower()=="y":
			while True:
				RUN_array_index=int_s(input("\n\n" + CYAN + "您要「刪除值」的目標清單:(請輸入編號1~8)" + END + "\n\n"))
				if RUN_array_index<=8 and RUN_array_index>=1:
					break
				else:
					print("輸入有誤!只開放8個清單,請輸入1~8之間的號碼!")
			if len(target_url[RUN_array_index])!=0:
				while True:
					target_array_index=int_s(input("\n\n" + CYAN + "您要「刪除的值編號」:(請輸入編號0~" + str(len(target_url[RUN_array_index])-1) + ")" + END + "\n\n"))
					if target_array_index>=0 and target_array_index<=(len(target_url[RUN_array_index])-1):
						break
					else:
						print("輸入有誤!請輸入0~" + str(len(target_url[RUN_array_index])-1) + "之間的號碼!")
				del target_url[RUN_array_index][target_array_index]
			else:
				print("空清單!無任何值!故無法刪除。")
			input("\n\n完成!請輸入ENTER鍵跳出此功能...")
	elif(method_X==9):
		ver = str(input("\n\n" + RED + "確定從剪貼簿貼上目標清單?(y/n)" + END + "\n\n"))
		if ver.lower()=="y":
			while True:
				target_array_index=int_s(input("\n\n" + CYAN + "您要「存入」的目標清單:(請輸入編號1~8)" + END + "\n\n"))
				if target_array_index<=8 and target_array_index>=1:
					break
				else:
					print("輸入有誤!只開放8個清單,請輸入1~8之間的號碼!")
			kk=get_clipboard()
			if kk=="" or kk==None:
				print(RED + "剪貼簿是空的!" + END)
			else:
				ori_size=len(target_url[target_array_index])
				target_url[target_array_index].extend(kk.split("\n"))
				target_url[target_array_index]=remove_duplicates(target_url[target_array_index])
				print(LIME + "已添加進去 " + str(len(target_url[target_array_index])-ori_size) + " 個不重複的URL。" + END)
			input("\n\n完成!請輸入ENTER鍵跳出此功能...")
input("\n\n請輸入ENTER鍵結束...")

 


This entry was posted in General, Experience, Free, Functions, Note, Product, Python By Weil Jimmer.

IME無法啟動解決方法

No Comments
-
發布於 2015-09-26 14:13:00

如圖,如果說切換到 非打字欄位出現這畫面很正常,但反之,若切換到可輸入資料的欄位卻無法正常顯示,可以先嘗試使用 Ctrl + Space (空白鍵)。若依舊無法啟動,表示此程式可能已關閉/崩潰。

Try:Ctrl + R,輸入 「C:\Windows\system32\ctfmon.exe」按下Enter。輸入法就應該會回來了。


This entry was posted in General, Experience, Functions By Weil Jimmer.

最前頁 上一頁  1 2 3 4 5 6 7 8 9 10 11 /11 頁)下一頁 最終頁

Visitor Count

pop
nonenonenone

Note

台灣假新聞橫行,沒一家霉體能信的,網軍側翼到處洗風向,堪憂。

毋忘初心,
絕不利慾薰心。

支持網路中立性.
Support Net Neutrality.

飽暖思淫欲,
饑寒起盜心。

支持臺灣實施無條件基本收入

歡迎前來本站。

Words Quiz


Search

Music

Blogging Journey

4480days

since our first blog post.

The strong do what they can and the weak suffer what they must.

Privacy is your right and ability to be yourself and express yourself without the fear that someone is looking over your shoulder and that you might be punished for being yourself, whatever that may be.

It is quality rather than quantity that matters.

I WANT Internet Freedom.

Reality made most of people lost their childishness.

Justice,Freedom,Knowledge.

Without music life would be a mistake.

Support/Donate

This site also need a little money to maintain operations, not entirely without any cost in the Internet. Your donations will be the best support and power of the site.
MethodBitcoin Address
bitcoin1gtuwCjjVVrNUHPGvW6nsuWGxSwygUv4x
buymeacoffee
Register in linode via invitation link and stay active for three months.Linode

Support The Zeitgeist Movement

The Zeitgeist Movement

The Lie We Live

The Lie We Live

The Questions We Never Ask

The Questions We Never Ask

Man

Man

THE EMPLOYMENT

Man

In The Fall

In The Fall

Facebook is EATING the Internet

Facebook

Categories

Android (7)

Announcement (4)

Arduino (2)

Bash (2)

C (3)

C# (5)

C++ (1)

Experience (52)

Flash (2)

Free (13)

Functions (36)

Games (13)

General (60)

Git (2)

HTML (7)

Java (13)

JS (7)

Mood (24)

NAS (2)

Note (32)

Office (1)

OpenWrt (6)

PHP (9)

Privacy (4)

Product (12)

Python (4)

Software (11)

The Internet (25)

Tools (16)

VB.NET (8)

WebHosting (7)

Wi-Fi (5)

XML (4)