Is the hard drive dying?
Posted: 2011-11-12 21:05:08
tom.cat
Hi all.
A nasty problem has turned up. For the last couple or three days, the following messages have started appearing in the logs:
Code:
Nov 12 21:58:10 asus kernel: vm_fault: pager read error, pid 50221 (dialog)
Nov 12 21:58:10 asus kernel: pid 50221 (dialog), uid 0: exited on signal 11 (core dumped)
Nov 12 21:58:24 asus kernel: g_vfs_done():ada0p2[READ(offset=142014492672, length=8192)]error = 5
Nov 12 21:58:24 asus kernel: vnode_pager_getpages: I/O read error
Nov 12 21:58:24 asus kernel: vm_fault: pager read error, pid 50362 (dialog)
Nov 12 21:58:24 asus kernel: pid 50362 (dialog), uid 0: exited on signal 11 (core dumped)
Nov 12 21:58:44 asus kernel: g_vfs_done():ada0p2[READ(offset=142014492672, length=8192)]error = 5
Nov 12 21:58:44 asus kernel: vnode_pager_getpages: I/O read error
Nov 12 21:58:44 asus kernel: vm_fault: pager read error, pid 53067 (dialog)
Nov 12 21:58:44 asus kernel: pid 53067 (dialog), uid 0: exited on signal 11 (core dumped)
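(error = 5 is EIO, a hardware-level read error. The offset in the g_vfs_done line is in bytes relative to ada0p2, so it can be turned into a sector number; a quick sanity calculation, assuming the 512-byte sectors SMART reports later:)
Code:
# convert the failing byte offset into a partition-relative 512-byte sector
echo $((142014492672 / 512))    # prints 277372056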
This junk shows up while the ports are being updated. Here is what we have:
Code:
$ uname -a
FreeBSD asus 9.0-RC2 FreeBSD 9.0-RC2 #0: Thu Nov 10 22:58:32 MSK 2011 vovas@asus:/usr/obj/usr/src/sys/ASUS amd64
$
What is mounted:
Code:
$ mount
/dev/ada0p2 on / (ufs, local, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
procfs on /proc (procfs, local)
linprocfs on /compat/linux/proc (linprocfs, local)
linsysfs on /compat/linux/sys (linsysfs, local)
zfspool on /zfspool (zfs, local, nfsv4acls)
zfspool/tank on /zfspool/tank (zfs, local, nfsv4acls)
$
I booted from a livecd and ran a filesystem check. It found a couple of errors; I ran it a couple more times and rebooted. The end result is the same: the machine starts to stall while updating ports or on any write to the system disk.
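(The exact command is not quoted in the post; a plausible check from a livecd, assuming the UFS root on ada0p2 and not the poster's verbatim invocation, would be:)
Code:
# full check of the unmounted UFS root, answering yes to repairs
fsck -y /dev/ada0p2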
Re: Is the hard drive dying?
Posted: 2011-11-12 21:35:33
Guest
Actually, the problem is not the FS but the swap partition.
Re: Is the hard drive dying?
Posted: 2011-11-12 21:37:06
tom.cat
How did you determine that? And how do I fix it?
P.S.
Code:
$ swapinfo
Device 1K-blocks Used Avail Capacity
/dev/ada0p3 4194304 0 4194304 0%
$
Re: Is the hard drive dying?
Posted: 2011-11-12 22:52:16
Guest
Well, find another drive, set up swap on it, and switch over to it.
Then watch the symptoms.
And anyway, 9.0-RC2 does not exactly inspire confidence.
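(Moving swap to another disk can be done on the fly; a minimal sketch, assuming the spare shows up as ada1 with a swap partition ada1p3:)
Code:
# stop swapping on the suspect disk and switch to the spare
swapoff /dev/ada0p3
swapon /dev/ada1p3
# confirm which device now backs swap
swapinfo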
Re: Is the hard drive dying?
Posted: 2011-11-12 23:39:51
BirdGovorun
I think the disk is dying. One recently died on me for good with exactly these symptoms; good thing it was in a mirror.
Re: Is the hard drive dying?
Posted: 2011-11-13 0:26:52
GhOsT_MZ
Can we see the drive's SMART data?
Re: Is the hard drive dying?
Posted: 2011-11-13 16:13:14
tom.cat
I used clonehdd and cloned everything onto an identical disk; the problem drive is now ada1.
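(clonehdd is a wrapper from ports; as an illustration rather than the command actually used, here is a rough dd equivalent of such a clone, assuming the failing disk is ada0 and the spare is ada1. conv=noerror,sync keeps going past unreadable sectors and pads them with zeros:)
Code:
# whole-disk copy that skips unreadable sectors instead of aborting
dd if=/dev/ada0 of=/dev/ada1 bs=64k conv=noerror,sync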
Here is its SMART output:
Code:
asus# smartctl -a /dev/ada1
smartctl 5.42 2011-10-20 r3458 [FreeBSD 9.0-RC2 amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Caviar Blue Serial ATA
Device Model: WDC WD1600AAJS-00PSA0
Serial Number: WD-WMAP92320622
LU WWN Device Id: 5 0014ee 0aac31e97
Firmware Version: 05.06H05
User Capacity: 160 041 885 696 bytes [160 GB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Sun Nov 13 17:12:39 2011 MSK
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 3960) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 54) minutes.
Conveyance self-test routine
recommended polling time: ( 6) minutes.
SCT capabilities: (0x103f) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 200 200 051 Pre-fail Always - 1836
3 Spin_Up_Time 0x0003 161 159 021 Pre-fail Always - 2941
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 580
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x000e 200 200 051 Old_age Always - 0
9 Power_On_Hours 0x0032 083 082 000 Old_age Always - 12856
10 Spin_Retry_Count 0x0012 100 100 051 Old_age Always - 0
11 Calibration_Retry_Count 0x0012 100 100 051 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 528
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 12
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 580
194 Temperature_Celsius 0x0022 102 085 000 Old_age Always - 41
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0012 200 200 000 Old_age Always - 7
198 Offline_Uncorrectable 0x0010 200 200 000 Old_age Offline - 4
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 051 Old_age Offline - 3
SMART Error Log Version: 1
ATA Error Count: 1920 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 1920 occurred at disk power-on lifetime: 12849 hours (535 days + 9 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 8c e7 6a 40 Error: UNC at LBA = 0x006ae78c = 7006092
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 01 00 8c e7 6a 00 08 00:27:03.002 READ FPDMA QUEUED
2f 00 01 10 00 00 00 08 00:27:03.002 READ LOG EXT
60 01 00 8c e7 6a 00 08 00:27:00.289 READ FPDMA QUEUED
2f 00 01 10 00 00 00 08 00:27:00.288 READ LOG EXT
60 01 00 8c e7 6a 00 08 00:26:57.433 READ FPDMA QUEUED
Error 1919 occurred at disk power-on lifetime: 12849 hours (535 days + 9 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 8c e7 6a 40 Error: UNC at LBA = 0x006ae78c = 7006092
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 01 00 8c e7 6a 00 08 00:27:00.289 READ FPDMA QUEUED
2f 00 01 10 00 00 00 08 00:27:00.288 READ LOG EXT
60 01 00 8c e7 6a 00 08 00:26:57.433 READ FPDMA QUEUED
2f 00 01 10 00 00 00 08 00:26:57.433 READ LOG EXT
60 01 00 8c e7 6a 00 08 00:26:54.719 READ FPDMA QUEUED
Error 1918 occurred at disk power-on lifetime: 12849 hours (535 days + 9 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 8c e7 6a 40 Error: UNC at LBA = 0x006ae78c = 7006092
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 01 00 8c e7 6a 00 08 00:26:57.433 READ FPDMA QUEUED
2f 00 01 10 00 00 00 08 00:26:57.433 READ LOG EXT
60 01 00 8c e7 6a 00 08 00:26:54.719 READ FPDMA QUEUED
2f 00 01 10 00 00 00 08 00:26:54.719 READ LOG EXT
60 01 00 8c e7 6a 00 08 00:26:51.719 READ FPDMA QUEUED
Error 1917 occurred at disk power-on lifetime: 12849 hours (535 days + 9 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 8c e7 6a 40 Error: UNC at LBA = 0x006ae78c = 7006092
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 01 00 8c e7 6a 00 08 00:26:54.719 READ FPDMA QUEUED
2f 00 01 10 00 00 00 08 00:26:54.719 READ LOG EXT
60 01 00 8c e7 6a 00 08 00:26:51.719 READ FPDMA QUEUED
60 01 00 8b e7 6a 00 08 00:26:51.428 READ FPDMA QUEUED
60 01 00 8a e7 6a 00 08 00:26:51.212 READ FPDMA QUEUED
Error 1916 occurred at disk power-on lifetime: 12849 hours (535 days + 9 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 8c e7 6a 40 Error: UNC at LBA = 0x006ae78c = 7006092
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 01 00 8c e7 6a 00 08 00:26:51.719 READ FPDMA QUEUED
60 01 00 8b e7 6a 00 08 00:26:51.428 READ FPDMA QUEUED
60 01 00 8a e7 6a 00 08 00:26:51.212 READ FPDMA QUEUED
60 01 00 89 e7 6a 00 08 00:26:50.754 READ FPDMA QUEUED
60 01 00 88 e7 6a 00 08 00:26:50.695 READ FPDMA QUEUED
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed: read failure 90% 12251 7006092
# 2 Short offline Completed: read failure 90% 12227 7006092
# 3 Short offline Completed: read failure 90% 12203 7006093
# 4 Short offline Completed: read failure 90% 12179 7006089
# 5 Short offline Completed: read failure 90% 12155 7006092
# 6 Extended offline Completed: read failure 90% 12132 7006093
# 7 Short offline Completed: read failure 90% 12131 7006092
# 8 Short offline Completed: read failure 90% 12107 7006089
# 9 Short offline Completed: read failure 90% 12083 7006092
#10 Short offline Completed: read failure 90% 12059 7006092
#11 Short offline Completed: read failure 90% 12035 7006092
#12 Short offline Completed: read failure 90% 12011 7006092
#13 Short offline Completed: read failure 90% 11987 7006089
#14 Extended offline Completed: read failure 90% 11964 7006092
#15 Short offline Completed: read failure 90% 11963 7006093
#16 Short offline Completed: read failure 90% 11939 7006089
#17 Short offline Completed: read failure 90% 11915 7006092
#18 Short offline Completed: read failure 90% 11891 7006092
#19 Short offline Completed: read failure 90% 11867 7006089
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Re: Is the hard drive dying?
Posted: 2011-11-13 17:17:07
GhOsT_MZ
Yes, it is not in great shape... if you have a spare identical drive, personally I would dump everything onto it first, and only then run any experiments... and if you want the opinion of proper drive specialists, I have seen a thread on ixbt where people pore over drive SMART dumps.
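(The telling numbers above are Current_Pending_Sector = 7, Offline_Uncorrectable = 4, and the repeated UNC errors at LBA 7006092, with every logged self-test failing around that same spot. A quick way to watch just those counters, assuming smartmontools is installed:)
Code:
# print only the attribute table and filter the surface-damage counters
smartctl -A /dev/ada1 | egrep 'Pending|Offline_Unc|Reallocated'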
Re: Is the hard drive dying?
Posted: 2011-11-14 11:26:08
Mox
Is it too much trouble to run mhdd over it?
Re: Is the hard drive dying?
Posted: 2011-11-14 12:32:08
GhOsT_MZ
Mox wrote: Is it too much trouble to run mhdd over it?
Nope, because there is a real chance of finishing it off for good... any drive specialist will tell you that first you image the drive, and only then take any active steps...
Re: Is the hard drive dying?
Posted: 2011-11-14 12:37:09
tom.cat
I have already swapped it out, just to be safe. Thanks for the help.

Re: Is the hard drive dying?
Posted: 2011-11-14 13:45:12
Mox
GhOsT_MZ wrote: Mox wrote: Is it too much trouble to run mhdd over it?
Nope, because there is a real chance of finishing it off for good... any drive specialist will tell you that first you image the drive, and only then take any active steps...
Let me rephrase:
0. Make a backup (apparently this still needs to be spelled out); a sketch follows below.
1. Is it too much trouble to run mhdd?
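(For item 0 on a UFS root like this one, a minimal sketch using the base system's dump and restore; the target filesystem mounted at /mnt/backup is an assumption:)
Code:
# level-0 dump of the live root filesystem, restored onto a disk mounted at /mnt/backup
dump -0Laf - / | (cd /mnt/backup && restore -rf -)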
Re: Is the hard drive dying?
Posted: 2011-11-14 14:02:12
Alex Keda
Item "0" should already be in place by default, from the start.
Re: Is the hard drive dying?
Posted: 2011-11-14 14:04:57
GhOsT_MZ
But does everyone actually remember item "0"? I can't speak for others, but it costs me nothing to mention it, just in case.